0. Introduction and Summary. This paper extends and unifies some
previous formulations and theories of estimation for one-parameter
problems. The basic criterion used is admissibility of a point
estimator,  defined  with  reference  to  its  full  distribution  rather 
than  special  loss  functions  such  as  squared  error.  Theoretical 
methods  of  characterizing  admissible  estimators  are  given,  and 
practical  computational  methods  for  their  use  are  illustrated  in 
a  variety  of  examples. 

Point,  confidence  limit,  and  confidence  interval  estimation  are 
included  in  a  single  theoretical  formulation,  and  incorporated  into 
estimators of an "omnibus" form called "confidence curves." The
usefulness of the latter for some applications as well as theoretical
purposes is illustrated.

Fisher's maximum likelihood principle of estimation is generalized,
given exact (non-asymptotic) justification, and unified with
the theory of tests and confidence regions of Neyman and Pearson.

Relations  between  exact  and  asymptotic  results  are  discussed. 

An  application  of  the  general  theory  gives  optimal  sequential 
estimators  having  prescribed  precision  in  a  specified  interval. 

Further developments, including multiparameter and nuisance parameter
problems, problems of choice among admissible estimators,
formal and informal criteria for optimality, and related problems
in the foundations of statistical inference, will be presented
subsequently.

1. A broad formulation of the problem of point estimation. We consider
problems of estimation with reference to a specified experiment
$E$, leaving aside here questions of experimental design, including
those of choice of a sample size or a sequential sampling rule;
some definite sampling rule, possibly sequential, is assumed specified
as part of $E$. Let $S = \{x\}$ denote the sample space of possible
outcomes $x$ of the experiment. Let $f(x, \theta)$ denote one of the
elementary probability functions on $S$ which are specified as possibly
true. Let $\Omega = \{\theta\}$ denote the specified parameter space.
For each $\theta$ in $\Omega$ and for each subset $A$ of $S$, the
probability that $E$ yields an outcome $x$ in $A$ is given by

$$\mathrm{Prob}\{X \in A \mid \theta\} = \int_A f(x, \theta)\, d\mu(x),$$

where $\mu$ is a specified $\sigma$-finite measure on $S$. (We assume tacitly
here and below that consideration is appropriately restricted to
measurable sets and functions only.)

If $\gamma = \gamma(\theta)$ is any function defined on $\Omega$ (e.g.
$\gamma(\theta) \equiv \theta$ or $\gamma(\theta) \equiv \theta^2$), with range
$\Gamma$, a point estimator of $\gamma$ is any measurable function
$g = g(x)$ taking values in $\Gamma$ (or in $\bar{\Gamma}$, its closure, if,
for example, $\Gamma$ is an open interval). The problem of choosing a good
estimator, that is, an estimator which tends to take values close to
the true unknown value of $\gamma$, has been formulated mathematically in
various ways. Most formulations achieve mathematical definiteness
by introducing criteria of closeness which appear somewhat arbitrary
from some standpoints of application and undesirably schematic as
expressions of the intuitive notion of closeness.

If $\Omega$ is given no specific (parametric) structure, then the
latter features can be fully avoided only by a very broad formulation
which specifies only that if $\gamma$ is true, then an exactly correct
estimate ($g = \gamma$) is closer than any incorrect estimate ($g \ne \gamma$). If
$\Omega$ is finite, $\Omega = \{\theta_1, \ldots, \theta_k\}$, and
$\gamma(\theta) \equiv \theta$, this leads to the
formulation of Lindley [1] in which estimators are compared only
on the basis of their error probabilities

$$p_{ij} = \mathrm{Prob}[\theta^*(X) = \theta_j \mid \theta_i], \quad i, j = 1, \ldots, k, \quad i \ne j,$$

where $\theta^*(x)$ is any estimator of $\theta$. This formulation has no very
useful extension to typical estimation problems in which, for
example, $\Omega$ is an interval, and in which the event
$\theta^*(X) = \theta$ exactly has typically negligible probability and
little interest.

The case in which $\Omega$ is any set of real numbers, for example an
interval, and $\gamma(\theta) \equiv \theta$ may be termed the central problem
of the theory of point estimation, although very important
generalizations of this problem have been treated extensively. For
this problem, closeness of $\theta^*$ to $\theta$ has been specified by the
introduction of specific loss functions: The absolute error criterion,
$|\theta^* - \theta|$, was introduced by Laplace. Gauss replaced this by the
squared error criterion $(\theta^* - \theta)^2$, which proved mathematically
much more tractable and provided a definite formulation of the problem
which seemed equally reasonable. A generalized squared error criterion,
$c(\theta)(\theta^* - \theta)^2$, where $c(\theta)$ is any specified positive
function, is used in some work in modern statistical decision theory.
Such criteria are sometimes used in conjunction with the requirement of
unbiasedness, $E(\theta^*(X) \mid \theta) \equiv \theta$; this is done (evidently
primarily to facilitate mathematical developments) particularly in the
theory of linear estimation due to Gauss; this reduces the mean squared
error criterion to a criterion of variance:
$E[(\theta^* - \theta)^2 \mid \theta] = \mathrm{Var}(\theta^* \mid \theta)$. (For a brief
account of the history of the theory of point estimation, cf. Neyman
[2], pp. 9-14.)

Each such definite specification of closeness can be criticised
as somewhat arbitrary, except in a context where one postulates
the reality of the indicated costs of errors of each possible kind.
To avoid such features it is evidently necessary and sufficient to
adopt the following weak specification of closeness: If
$\theta_1 < \theta_2 \le \theta$, or if $\theta \le \theta_2 < \theta_1$, the estimate
$\theta_2$ is called closer than $\theta_1$ to $\theta$; if
$\theta_1 < \theta < \theta_2$, no comparison as to closeness is to be made. (The
latter point was put forth by Galileo in an exchange which retains
interest in connection with questions of formulation of estimation
problems, particularly distinctions between errors of inference
and economic valuations, and the historical origins of unbiasedness
criteria. Cf. [3].)

This specification of closeness leads to comparisons between
estimators on the basis of all of their probabilities of errors of
over-estimation and under-estimation by various amounts:

$$a(u, \theta, \theta^*) = \begin{cases} F(u, \theta, \theta^*) \equiv \mathrm{Prob}\{\theta^*(X) \le u \mid \theta\} & \text{for } u < \theta, \\ 1 - F(u-0, \theta, \theta^*) \equiv \mathrm{Prob}\{\theta^*(X) \ge u \mid \theta\} & \text{for } u > \theta. \end{cases}$$

That is, estimators are compared only on the basis of their complete
cumulative distribution functions (c.d.f.'s) $F(u, \theta, \theta^*)$ for each
$\theta \in \Omega$, rather than on the basis of certain "summaries"
(functionals) of these c.d.f.'s such as mean squared error. The function
$a(u, \theta, \theta^*)$, defined for any estimator $\theta^*(x)$ at each
$\theta \in \Omega$ and each $u \ne \theta$, will be called the risk curve of
$\theta^*$ at $\theta$ (or, more precisely,

The family of distributions under consideration may be viewed
as having a parametric structure only in the sense that it is ordered
by the labeling of each function $f(x, \theta)$ of $x$ by a different real
number $\theta$. From this standpoint, the problem of estimating $\theta$ is
equivalent to that of estimating $\gamma = \gamma(\theta)$ if the latter is any
specified strictly monotone function. The formulation adopted
above is clearly unaffected by (invariant under) such transformations
of the parameter space ($\Omega \to \gamma(\Omega) = \Gamma$), as contrasted
with some other formulations referred to above.
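
The definition of the risk curve can be made concrete by a small
computation. The following is only a minimal sketch under assumptions
that are ours and not part of the development above: a binomial
experiment with n trials, the illustrative estimator theta*(x) = x/n,
and the function names binom_pmf and risk_curve. For u < theta the code
sums Prob{theta*(X) <= u | theta}; for u > theta it sums
Prob{theta*(X) >= u | theta}; the value 0 returned at u = theta is a
convenience assumed in the sketch.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def risk_curve(u, theta, n):
    """a(u, theta, theta*) for theta*(x) = x / n with X ~ Binomial(n, theta)."""
    if u == theta:
        return 0.0  # convention assumed here: no error is charged at u = theta
    if u < theta:
        # probability of under-estimation: the estimate falls at or below u
        return sum(binom_pmf(k, n, theta) for k in range(n + 1) if k / n <= u)
    # probability of over-estimation: the estimate falls at or above u
    return sum(binom_pmf(k, n, theta) for k in range(n + 1) if k / n >= u)


if __name__ == "__main__":
    for u in (0.1, 0.25, 0.4, 0.6, 0.75, 0.9):
        print(f"u = {u:4.2f}   a(u, 0.5, theta*) = {risk_curve(u, 0.5, 10):.4f}")
```

Because the distribution is discrete, the distinction between the
events "<= u" and "< u" matters here, which is the reason the
definition uses the left limit of the c.d.f. for u > theta.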

A  theory  of  point  estimation  based  on  this  broad  formulation 
seems  appropriate  for  typical  problems  of  inference  occurring  in 
empirical  research,  since  various  kinds  of  errors  of  inference  and 
their  probabilities  admit  simple  direct  interpretations,  whereas 
other  formulations  introduce  specifications  akin  to  costs  of 
various  errors  which  seem  somewhat  hypothetical  or  arbitrary  in 
such  situations.  The  present  theory  also  has  theoretical  and 
technical relevance for estimation theories based on more restrictive
formulations, since it includes such theories in a formal
sense  which  will  be  elaborated  in  a  following  section. 

2. Admissible point estimators. An estimator $\theta^*(x)$ of $\theta$ is
naturally considered a good one if its error-probabilities are suitably
small, i.e. if (the ordinates of) its risk curves $a(u, \theta, \theta^*)$, for
each $\theta \in \Omega$ and each $u \ne \theta$, are suitably small. This leads to a
natural partial ordering of estimators, under which some but not all
pairs of estimators can be compared. As a basis for systematic
evaluations and comparisons of estimators we require the following
Definitions: For a given estimation problem, an estimator $\theta^*$ is
called at least as good as an estimator $\theta^{**}$ if
$a(u, \theta, \theta^*) \le a(u, \theta, \theta^{**})$ for all $\theta \in \Omega$ and all
$u \ne \theta$. If $\theta^*$ and $\theta^{**}$ are each at least as good as the
other, then $a(u, \theta, \theta^*) \equiv a(u, \theta, \theta^{**})$, and the estimators
are called equivalent. If neither of $\theta^*$, $\theta^{**}$ is at least as
good as the other, the two estimators are called not comparable. If
$\theta^*$ is at least as good as $\theta^{**}$ and if
$a(u, \theta, \theta^*) < a(u, \theta, \theta^{**})$ for some $\theta \in \Omega$ and some
$u \ne \theta$, $\theta^*$ is called better than $\theta^{**}$. An estimator
$\theta^*$ is called admissible if no other estimator is better than $\theta^*$.
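
The definitions above can be exercised numerically. The following is a
hedged sketch, not a procedure taken from the text: it assumes a normal
observation with unit variance, estimators of the illustrative form
x + c, and a finite grid of values of theta and u. The names shift_risk
and compare are ours. A finite grid can refute the relation "at least
as good" but can never establish it for all theta and all u, so the
labels returned describe the grid only.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal c.d.f.


def shift_risk(c):
    """Risk curve a(u, theta) of theta*(x) = x + c when X ~ N(theta, 1)."""
    def a(u, theta):
        z = u - theta - c
        return PHI(z) if u < theta else 1.0 - PHI(z)
    return a


def compare(risk_a, risk_b, thetas, offsets, tol=1e-12):
    """Classify two estimators from their risk curves on a finite grid."""
    a_ok = b_ok = True  # "at least as good" flags for each estimator
    for theta in thetas:
        for d in offsets:  # offsets must be non-zero so that u != theta
            u = theta + d
            ra, rb = risk_a(u, theta), risk_b(u, theta)
            if ra > rb + tol:
                a_ok = False
            if rb > ra + tol:
                b_ok = False
    if a_ok and b_ok:
        return "equivalent"
    if a_ok:
        return "first is better"
    if b_ok:
        return "second is better"
    return "not comparable"


if __name__ == "__main__":
    grid_theta = [-1.0, 0.0, 1.0]
    grid_d = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
    print(compare(shift_risk(0.0), shift_risk(1.0), grid_theta, grid_d))
```

With these assumptions the estimators x and x + 1 are reported as not
comparable: the second has smaller probabilities of under-estimation
and larger probabilities of over-estimation at every grid point.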

The  class  of  admissible  estimators  is  called  the  admissible  class. 

A  class  of  estimators  is  called  complete  if,  for  each  estimator 
outside  the  class,  there  is  a  better  one  in  the  class.  The  minimal 
(smallest) complete class, if one exists, coincides with the
admissible class. A class of estimators is called essentially
complete if, for each estimator not in the class, there is one at
least as good in the class. A minimal essentially complete class,
if  one  exists,  is  a  subclass  of  the  admissible  class. 

The above definition of admissibility was included in a list
of criteria for point estimators by Savage [4] (pp. 224-225); but it
has not previously been used systematically.

The criterion of closeness of estimators introduced by Pitman [5]
also deals with the full c.d.f.'s of estimators, in the form of
the joint distribution of each pair of estimators being compared;
however, this criterion does not give a partial ordering of estimators,
and does not lend itself to our present purposes.

For the probabilities of under-estimation and over-estimation,
we define also

$$a(\theta-, \theta, \theta^*) = \mathrm{Prob}\{\theta^*(X) < \theta \mid \theta\} = \lim_{\epsilon \to 0,\ \epsilon > 0} a(\theta - \epsilon, \theta, \theta^*),$$

$$a(\theta+, \theta, \theta^*) = \mathrm{Prob}\{\theta^*(X) > \theta \mid \theta\} = \lim_{\epsilon \to 0,\ \epsilon > 0} a(\theta + \epsilon, \theta, \theta^*).$$

For formal convenience, we also define $a(\theta, \theta, \theta^*) \equiv 0$.

When reference to a given estimator $\theta^*$ is understood, we may write
simply $a(u, \theta)$, $a(\theta-, \theta)$, or $a(\theta+, \theta)$. The functions
$a(\theta-, \theta)$ and $a(\theta+, \theta)$ of $\theta$ play a useful technical role,
and will be called respectively the lower and upper location functions
of $\theta^*$.

In many problems, estimators for which
$\mathrm{Prob}\{\theta^*(X) = \theta \mid \theta\} > 0$ for some $\theta$ are found not
useful. The remaining estimators have continuous c.d.f.'s, and have
$a(\theta-, \theta) = 1 - a(\theta+, \theta)$. No two such estimators, having
different location functions, can be comparable; for
$a(\theta-, \theta, \theta^*) < a(\theta-, \theta, \theta^{**})$ is equivalent to
$a(\theta+, \theta, \theta^*) > a(\theta+, \theta, \theta^{**})$; this shows that neither
estimator is at least as good as the other.

The broad and "weak" definition of admissibility adopted here
leads to very large admissible classes in typical problems. However,
it does not seem unreasonable to conceive of the problem of point
estimation as one in which the investigator chooses an estimator on
the basis of consideration of the risk curves of all estimators in
some essentially complete class. In principle this consideration
should be complete, but of course the practical counterpart of this
can be at most a more or less extensive familiarity with an essentially
complete class, developed by study of the risk curves of a
variety of specific estimators, possibly strengthened by some
general theoretical considerations (including envelope risk curves,
discussed below), and perhaps also by reference to one or several loss
functions and criteria of optimality which may seem more or less
appropriate in specific applications. Such an approach is not so
difficult to carry out as might be anticipated, as will be illustrated.
Of course difficulties of computation or complexity may
sometimes dictate that an inadmissible estimator must be adopted;
even in such cases, the most general basis on which any particular
estimator might be justified as not too inefficient is evidently
the comparison of its risk curves with those of other estimators,
especially admissible ones.

Example. Let $X$ be normally distributed with unknown mean $\theta$
and variance 1, with $\Omega = \{\theta \mid -\infty < \theta < \infty\}$. Consider,
when $\theta = 1$, the risk curves of the classical estimator
$\hat{\theta}(x) = x$, and of the estimators $\theta^*(x) = x + 1$ and
$\theta^{**}(x) \equiv +\infty$. We have

$$a(u, 1, \hat{\theta}) = \begin{cases} \Phi(u - 1) & \text{for } u < 1, \\ 1 - \Phi(u - 1) & \text{for } u > 1, \end{cases}$$

where

$$\Phi(v) = (2\pi)^{-1/2} \int_{-\infty}^{v} e^{-t^2/2}\, dt,$$

$$a(u, 1, \theta^*) = \begin{cases} \Phi(u - 2) & \text{for } u < 1, \\ 1 - \Phi(u - 2) & \text{for } u > 1, \end{cases}$$

and

$$a(u, 1, \theta^{**}) = \begin{cases} 0 & \text{for } u < 1, \\ 1 & \text{for } u > 1. \end{cases}$$

Our wishful goal in choosing an estimator would be to minimize
simultaneously all ordinates of such curves, for all $\theta$ and all
$u \ne \theta$, since each ordinate is the probability of an error. Of
course this goal cannot be realized in non-trivial problems. The
estimator $\theta^*$ is superior to $\hat{\theta}$ with respect to all errors of
under-estimation, but worse with respect to over-estimation. From this
standpoint neither can be called better than the other; they are
not comparable. The apparently trivial estimator $\theta^{**}$ (but no
"smaller" one) is perfect in avoiding errors of under-estimation,
but is as bad as possible with respect to over-estimation.

It  will  be  seen  below  that  each  of  these  estimators  is  not 
only  admissible  but  that  each  has,  among  all  estimators  with  the 
same  location  functions,  uniformly  smallest  risk  curves. 
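
The three risk curves of the example, at the true value theta = 1, can
be tabulated directly from the normal c.d.f. The code below is a
minimal sketch; the function names a_classical, a_shifted and
a_infinite are ours, and the grid of values of u is chosen only for
illustration.

```python
from statistics import NormalDist

Phi = NormalDist().cdf  # standard normal c.d.f.
THETA = 1.0             # true value of the mean used in the example


def a_classical(u):
    """Risk curve at theta = 1 of the estimator x (defined for u != 1)."""
    return Phi(u - 1) if u < THETA else 1 - Phi(u - 1)


def a_shifted(u):
    """Risk curve at theta = 1 of the estimator x + 1 (defined for u != 1)."""
    return Phi(u - 2) if u < THETA else 1 - Phi(u - 2)


def a_infinite(u):
    """Risk curve at theta = 1 of the estimator identically +infinity."""
    return 0.0 if u < THETA else 1.0


if __name__ == "__main__":
    print("     u        x    x + 1     +inf")
    for u in (-1.0, 0.0, 0.5, 1.5, 2.0, 3.0):
        print(f"{u:6.2f} {a_classical(u):8.4f} {a_shifted(u):8.4f} {a_infinite(u):8.4f}")
```

In the table the column for x + 1 lies below the column for x when
u < 1 and above it when u > 1, which is the non-comparability noted in
the example.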

In most decision-theoretic formulations of statistical problems
a real-valued risk function $r(\theta, \theta^*)$ is defined for each parameter
point and each decision function. In the present formulation, we
associate with each pair $\theta$, $\theta^*$ a set of error-probabilities
$a(u, \theta, \theta^*)$, $u \ne \theta$. These respective error-probabilities, for
each fixed $\theta$ and $\theta^*$, may be regarded as components of a vector
denoted by $r(\theta, \theta^*) = \{a(u, \theta, \theta^*)\}$, the components
$a(u, \theta, \theta^*)$ having index $u$. Then $r(\theta, \theta^*)$ is an example of a
vector-valued risk function.

Knowledge of the admissible class or of an essentially complete
class of estimators in the present broad sense can be useful in
applying other formulations of the estimation problem. For example,
every estimator which is admissible with respect to a squared error
loss function must clearly be admissible in the present sense; hence
the search for estimators good in the former sense can be restricted
without loss to any class known to be essentially complete in the
broader sense. In this way, a hierarchy of definitions of admissibility
leads to a corresponding nested hierarchy of admissible or
essentially complete classes of estimators. (The latter concepts,
and that of vector-valued risk functions, were introduced in other
contexts by L. Weiss [6].)

3. Admissible confidence limits. If $\theta'' = \theta''(x)$ is a point
estimator of $\theta$ in a specified problem, with the property that
$\mathrm{Prob}[\theta''(X) < \theta \mid \theta] = a(\theta-, \theta, \theta'')$ is relatively
small for all $\theta$, then $\theta''$ is an upper estimator of $\theta$. In
particular, if $a(\theta-, \theta, \theta'') \equiv \alpha$ for all $\theta$, then
$\theta''$ is an upper confidence limit with confidence coefficient
$1 - \alpha$, or an upper $(1 - \alpha)$ confidence limit. Typically
a value $(1 - \alpha) \ge .5$ is chosen.

The typical use and interpretation of an upper estimate is
the following: When a given numerical value (observed value) is
obtained  by  use  of  an  upper  estimator,  this  is  taken  as  evidence 
supporting  the  conclusion  or  decision  that  the  true  unknown  value 
is  at  least  as  small  as  the  estimated  value.  Hence  the  merits  of 
any  upper  estimator  depend  upon  the  following  considerations,  in 
suitable  combination: 

(a) The probability should be suitably high that the indicated
conclusions, of the form "$\theta$ is not greater than $\theta''(x)$," are
correct for each possible true value of $\theta$. That is, the confidence
coefficient should have a suitably large value; or, more generally, the
lower location function $a(\theta-, \theta, \theta'')$ should have suitably low
values for all $\theta$. Such properties are sometimes referred to by the
term validity, particularly in the case of confidence limit estimators;
a valid $(1 - \alpha)$ upper confidence limit estimator is one which does in
fact have the property that
$\mathrm{Prob}\{\theta''(X) < \theta \mid \theta\} = \alpha$ for all $\theta \in \Omega$.

(b) Given that one of the indicated conclusions ("$\theta \le \theta''(x)$") is
correct, it should be as strong and informative a conclusion as
possible; hence for each possible true value of $\theta$, the conditional
distribution of $\theta''(X)$, given that $\theta \le \theta''(X)$, should be
concentrated as close to $\theta$ as possible. That is, given the location
function $a(\theta-, \theta, \theta'')$ of any upper estimator $\theta''$, for each
$\theta$ and each $u > \theta$ the values
$a(u, \theta, \theta'') = \mathrm{Prob}[\theta''(X) \ge u \mid \theta]$ should be suitably
small. Such properties of confidence limits have been termed accuracy
properties by Lehmann [7], p. ?8. More generally, in the theory of
confidence region estimation, such properties have been termed
shortness properties by Neyman [8].

(c) Given that one of the indicated conclusions ("$\theta \le \theta''(x)$") is
incorrect (i.e. that in fact $\theta > \theta''(x)$), the indicated conclusion
should be misleading in the smallest possible degree. For example,
in any given problem, under any given true value of $\theta$, when an
upper estimator takes a value two units below the true value, the
indicated conclusions (or inferences or actions or decisions) are
at least as erroneous (or inappropriate), and in general more so,
than when an upper estimator (with the same confidence coefficient
or location function) takes a value which is only one unit below
the true value. That is, given the location function
$a(\theta-, \theta, \theta'')$, for each $\theta$ and each $u < \theta$ the values
$a(u, \theta, \theta'')$ should be suitably small. This property has evidently
not previously been discussed along with those of validity and
shortness, but it seems necessary to include it for a complete
specification of the practical purposes and intuitive goals of
confidence limit estimation. All three properties are given some weight
in a specific loss function adopted in the decision-theoretic
treatment of Wolfowitz [9].
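
Properties (a), (b) and (c) can be checked numerically for one familiar
upper estimator. The following is a hedged sketch and not a
construction given in the text: it assumes a single normal observation
with unit variance and the upper limit x + z, where z is the
(1 - alpha) quantile of the standard normal distribution. The names
upper_limit, lower_location and risk_curve are ours.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def upper_limit(x, alpha):
    """Upper (1 - alpha) confidence limit x + z for the mean of N(theta, 1)."""
    return x + STD.inv_cdf(1 - alpha)


def lower_location(alpha):
    """(a) validity: Prob{upper limit < theta | theta}, the same for every theta."""
    return STD.cdf(-STD.inv_cdf(1 - alpha))


def risk_curve(u, theta, alpha):
    """(b) and (c): risk curve of the upper limit at theta, for u != theta."""
    z = STD.inv_cdf(1 - alpha)
    p = STD.cdf(u - theta - z)  # Prob{X + z <= u | theta}
    # u < theta: a misleading value at or below u (property (c));
    # u > theta: a correct but uninformative value at or above u (property (b)).
    return p if u < theta else 1 - p


if __name__ == "__main__":
    alpha = 0.05
    print("upper limit at x = 0:", round(upper_limit(0.0, alpha), 4))
    print("lower location function:", round(lower_location(alpha), 4))
    for u in (-2.0, -1.0, 1.0, 3.0):
        print(f"a({u:+.1f}, 0) = {risk_curve(u, 0.0, alpha):.4f}")
```

For u below theta the ordinates never exceed alpha and fall off as u
moves away from theta, while for u above theta they start near
1 - alpha, so that shortness is governed by how quickly they decrease.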

These  considerations  lead  in  the  usual  way  to  definitions  of 
admissibility  and  of  complete  classes  of  upper  and  lower  estimators. 
Properties (b) and (c) together are formally identical with the
closeness properties considered in the preceding section for point
estimators,  while  property  (a)  by  itself  is  merely  descriptive  of 
the  location  function  of  a  point  estimator.  Thus  every  admissible 
confidence  limit  estimator  is,  formally,  an  admissible  point 
estimator  as  defined  above,  and  is  contained  in  every  complete 
class  of  point  estimators. 

Hence  there  is  no  necessary  formal  distinction  between  the 
formulations,  theories,  and  practical  techniques  of  point  estimation 
on  the  one  hand  and  of  confidence  limit  estimation  on  the  other: 
the  distinctions  required  here  are  only  those  of  qualitative 
emphasis and quantitative degree which reflect the variety of possible
purposes for which a point or confidence limit estimator may
be chosen from, say, an essentially complete class. For example,
in choosing an upper estimator for a given application, it may be
judged that property (c) above should be given no weight as compared
with properties (a) and (b) because "a miss is as good as
a  mile"  in  the  given  context  of  application;  in  other  contexts, 
including  probably  most  cases  of  estimation  for  informative 
inference,  some  weight  may  be  given  to  each  property. 

4. Admissible interval estimators. If
$J = J(x) = (\theta', \theta'') = (\theta'(x), \theta''(x))$ is a pair of point
estimators such that $\theta'(x) \le \theta''(x)$ for each $x$ in $S$, then $J$
is an interval estimator of $\theta$. In particular, if
$\mathrm{Prob}\{\theta'(X) \le \theta \le \theta''(X) \mid \theta\} = 1 - \alpha$ for each
$\theta$, then $J$ is a confidence interval with confidence coefficient
$1 - \alpha$, or a $(1 - \alpha)$ confidence interval. (Typically a value
$(1 - \alpha) \ge .5$ is chosen.) The typical use and interpretation of an
interval estimate is the following: When given numerical values
$\theta'$ and $\theta''$ are obtained by use of an interval estimator, this is
taken as evidence for the conclusion that the true unknown value of
the parameter $\theta$ lies in the closed interval $[\theta', \theta'']$.

The probability properties of any interval estimator $J$ may be
described in the following terms: It is natural to call
$a(\theta-, \theta, \theta'')$ the lower location function of $J$ (as well as of
$\theta''$), and to denote it when convenient by $a(\theta-, \theta, J)$;
similarly $a(\theta+, \theta, J) = a(\theta+, \theta, \theta')$ is the upper location
function of $J$. As with point estimators, these functions give
respectively the probabilities of under-estimation and of
over-estimation when a given interval estimator $J$ is used. For
example, it is natural to call $J$ a median-unbiased interval estimator
if for each $\theta$ we have equal probabilities of over-estimation and
under-estimation: $a(\theta-, \theta, J) = a(\theta+, \theta, J)$. This usage is
compatible with the definition of a median-unbiased point estimator.

A quantity of primary interest is the probability that the
conclusion indicated by any interval estimator $J$ ("$\theta$ lies in
$[\theta', \theta'']$") will be incorrect, for each possible true value
$\theta$. This probability is just the sum of the location functions of $J$:

$$\mathrm{Prob}\{\theta \text{ not covered by } J(X) \mid \theta\} = \mathrm{Prob}\{\theta''(X) < \theta \mid \theta\} + \mathrm{Prob}\{\theta'(X) > \theta \mid \theta\} = a(\theta-, \theta, J) + a(\theta+, \theta, J).$$

If this probability equals $\alpha$ for each $\theta$, then $J$ is a
$(1 - \alpha)$ confidence interval; if in addition $J$ is median-unbiased,
then $\theta'$ and $\theta''$ are $(1 - \tfrac{1}{2}\alpha)$ confidence limits. As
with point and confidence limit estimators, it is of interest in
general to consider the probabilities of errors of under-estimation
and of over-estimation of various magnitudes in interval estimation;
we denote these probabilities by

$$a(u, \theta, J) = \begin{cases} a(u, \theta, \theta') & \text{for each } u > \theta, \\ a(u, \theta, \theta'') & \text{for each } u < \theta. \end{cases}$$
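
The decomposition of the error probability of an interval estimator
into its two location functions can be illustrated as follows. This is
a hedged sketch under assumptions of ours: a single normal observation
with unit variance and the symmetric interval (x - z, x + z), with z
the (1 - alpha/2) quantile of the standard normal distribution. The
names interval, location_functions and risk_curve_interval are
hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def interval(x, alpha):
    """Symmetric (1 - alpha) interval (x - z, x + z) for the mean of N(theta, 1)."""
    z = STD.inv_cdf(1 - alpha / 2)
    return (x - z, x + z)


def location_functions(alpha):
    """Return the lower and upper location functions of J (free of theta here)."""
    z = STD.inv_cdf(1 - alpha / 2)
    lower = STD.cdf(-z)     # Prob{upper end point < theta}: under-estimation
    upper = 1 - STD.cdf(z)  # Prob{lower end point > theta}: over-estimation
    return lower, upper


def risk_curve_interval(u, theta, alpha):
    """a(u, theta, J): lower end point for u > theta, upper end point for u < theta."""
    z = STD.inv_cdf(1 - alpha / 2)
    if u > theta:
        return 1 - STD.cdf(u - theta + z)  # Prob{X - z >= u | theta}
    return STD.cdf(u - theta - z)          # Prob{X + z <= u | theta}


if __name__ == "__main__":
    lo, up = location_functions(0.05)
    print("interval at x = 0:", interval(0.0, 0.05))
    print("location functions:", round(lo, 4), round(up, 4), "sum:", round(lo + up, 4))
```

Under these assumptions the two location functions are equal, so the
interval is median-unbiased, and their sum reproduces alpha.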

In a formal sense, a point estimator may be regarded as an
interval estimator $J = (\theta', \theta'')$ having the special form:
$\theta'(x) \equiv \theta''(x)$ for all $x$. The full specification of what is
meant by a good point estimator $\theta^*$, by use of the risk curves
$a(u, \theta, \theta^*)$, corresponds to the use of the functions
$a(u, \theta, J)$ to specify at least part of what is meant by a good
interval estimator $J$.

Again, in a formal sense an upper estimator $\theta''(x)$ may be
regarded as an interval estimator $J = (\theta', \theta'')$ having the special
form: $\theta'(x) \equiv$ the greatest lower bound of $\Omega$, for all $x$. The
full specification of what is meant by a good upper estimator $\theta''$,
by use of the risk curves $a(u, \theta, \theta'')$, corresponds to part of what
is meant by a good interval estimator; in particular, small values
of $a(u, \theta, \theta'')$ for $u > \theta$, which indicate desirable properties of
accuracy or shortness for an upper estimator $\theta''$, indicate
corresponding shortness properties for an interval estimator
$J = (\theta', \theta'')$.

The merits of any interval estimator $J$ depend upon the following
considerations in suitable combination:

(a) The probability should be suitably high that the indicated
conclusions ("$\theta$ lies in $[\theta', \theta'']$") are correct, for each
possible true value of $\theta$. That is, the confidence coefficient should
have a suitably high value; or, more generally, for each $\theta$, the sum
of the location functions $a(\theta-, \theta, J)$ and $a(\theta+, \theta, J)$ should
be suitably low. As with point estimators, it seems desirable to avoid,
as far as possible and convenient in the development of a general
theory, any step which corresponds to a tacit judgment that errors of
over-estimation and under-estimation are necessarily comparable either
qualitatively or quantitatively. Hence the present specification
will be given the form: Each of the location functions
$a(\theta-, \theta, J)$, $a(\theta+, \theta, J)$ should have suitably small values,
for each $\theta$.

(b) Given the location functions of an interval estimator (and,
hence, given the probability $1 - a(\theta-, \theta, J) - a(\theta+, \theta, J)$ of
correct conclusions, for each $\theta$), the indicated conclusions should,
when correct, be as strong and informative as possible. That is, for
each $\theta$, the conditional distributions of $\theta'(X)$ and $\theta''(X)$,
given that $\theta'(X) \le \theta \le \theta''(X)$, should be concentrated as close
to $\theta$ as possible. (In terms of the conditional bivariate
distribution of $(\theta'(X), \theta''(X))$, this means concentration close to
the point $(\theta, \theta)$.) These desirable shortness properties of $J$
correspond to suitably small values, for each $\theta$, of
$a(u, \theta, \theta'')$ for each $u > \theta$ and of $a(u, \theta, \theta')$ for each
$u < \theta$.

(c) Given that one of the conclusions indicated by $J$ is incorrect,
it should be misleading in the smallest possible degree. (The
remarks on property (c) of the preceding section are also applicable
here.) These desirable closeness properties of $J$ correspond to
suitably small values of $a(u, \theta, J)$ for each $\theta$ and each
$u \ne \theta$; that is, suitably small values of $a(u, \theta, \theta')$ for
$u > \theta$ and of $a(u, \theta, \theta'')$ for $u < \theta$.

To represent all of the properties considered for interval
estimators, we define the risk curves of each interval estimator
$J = (\theta', \theta'')$, at each $\theta$, as the pair of functions
$\{a(u, \theta, \theta'), a(u, \theta, \theta'')\}$ of $u$ ($u \ne \theta$), i.e. the risk
curves of $\theta'$ and of $\theta''$. Thus the risk curves of $J$ at $\theta$ are
a representation of the bivariate cumulative distribution function of
$\theta'(X)$ and $\theta''(X)$ when $\theta$ is true.

These considerations lead us to formulate the following basic
definitions: An interval estimator $J = (\theta', \theta'')$ will be called at
least as good as another $J^* = (\theta^{*\prime}, \theta^{*\prime\prime})$ if $\theta'$ is at
least as good as $\theta^{*\prime}$ and $\theta''$ is at least as good as
$\theta^{*\prime\prime}$ in the sense defined for point estimators in Section 2
above. Similarly, $J$ will be called better than $J^*$ if it is at least
as good as $J^*$ and also $\theta'$ is better than $\theta^{*\prime}$ and/or
$\theta''$ is better than $\theta^{*\prime\prime}$. $J$ will be called admissible if
no other interval estimator is better. Complete classes are defined in
the usual way.

If two interval estimators have different location functions,
they are not comparable (neither is at least as good as the other);
this follows immediately from the corresponding property for point
estimators. A simple sufficient condition for admissibility of
$J = (\theta', \theta'')$ is that $\theta'$ and $\theta''$ be admissible point
estimators.

5. Confidence curve estimators. The selection of an estimator of
one of the above kinds for purposes of informative inference,
including typical applications in scientific research, is generally
admitted to involve elements of choice which are in some degree
arbitrary. Such elements include the choice of a particular
confidence level for an interval estimator, and the choice of
location functions for an interval estimator with given confidence
coefficient. In addition, a point estimate is sometimes desired
along with an interval. Such considerations and related ones have
led to proposals for use simultaneously of a point estimator and a
set of confidence limit or interval estimators having various
confidence coefficients. Such estimators may be regarded as a
modern formulation of a long-standing practice of reporting
estimates in the form $\theta^* \pm k\sigma_{\theta^*}$, where $k$ is some constant
and $\sigma^2_{\theta^*} = \mathrm{Var}(\theta^*(X))$. The latter form may be
interpreted as an ordered set of three point estimators. For example,
if $\theta^*(X)$ has a normal distribution with a known constant variance,
and $k = 1$, then the "estimator" $\theta^*(x) \pm k\sigma_{\theta^*}$ may be
written as the ordered set of estimators

$$[\theta^*(x) - \sigma_{\theta^*},\ \theta^*(x),\ \theta^*(x) + \sigma_{\theta^*}] \equiv [\theta(x, .84),\ \theta(x, .5),\ \theta(x, .16)].$$


Estimates of this "omnibus" kind can be interpreted flexibly but
validly, in any context of application for informative inferences,
in the ways customary for (a) point estimates such as $\theta(x, .5)$,
(b) confidence limits such as $\theta(x, .84)$ and $\theta(x, .16)$, and (c)
confidence intervals such as $[\theta(x, .84),\ \theta(x, .16)]$.
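
The correspondence between the $\pm\sigma$ form and the levels .84, .5, .16 can be checked numerically. The following is a minimal sketch, assuming a normally distributed estimator with known standard deviation; the function names `confidence_limit` and `omnibus` and the numerical inputs are illustrative and do not come from the paper.

```python
from statistics import NormalDist


def confidence_limit(theta_star, sigma, alpha):
    """Limit theta(x, alpha) for a normal estimator with known s.d. sigma.

    The limit is decreasing in alpha, and under the normal assumption
    Prob[theta(X, alpha) <= theta] = alpha.
    """
    return theta_star - sigma * NormalDist().inv_cdf(alpha)


def omnibus(theta_star, sigma, levels=(0.84, 0.5, 0.16)):
    """Ordered set of limits at the given levels (lower, point, upper)."""
    return [confidence_limit(theta_star, sigma, a) for a in levels]


# Hypothetical observed estimate 10.0 with standard deviation 2.0.
est = omnibus(10.0, 2.0)
```

Because $\Phi(1)$ is only approximately .84, the outer limits agree with $\theta^* \mp \sigma$ to about two decimal places rather than exactly.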

Tukey [10] proposed that for typical general purposes it would
be advantageous to use a set of five point estimators at standard
levels: $\theta(x,\alpha)$, with $\alpha$ = 2½%, 16%, 50%, 84%, and 97½%.
Cox [11] proposed use of the full continuous family of confidence
limits $\theta(x,\alpha)$, $0 \le \alpha \le 1$. Such an omnibus estimator includes
formally, as elements, not only confidence limits at all levels
and a median-unbiased point estimator, but also median-unbiased
confidence intervals at all levels. Whether such estimators should
be used in practice, rather than more standard methods, is a matter
of judgment and taste which can perhaps be decided best in specific
contexts of application. It is often convenient, as will be
illustrated below, to discuss estimation theory and techniques for
estimators of this omnibus form, since such discussion includes
conveniently and compactly a treatment of estimators of the various
kinds mentioned.

Any such estimator, consisting of a specified set of confidence
limit estimators $\theta(x,\alpha)$, $\alpha$ in some specified subset of the closed
unit interval (possibly the whole interval), ordered in the sense
that $\alpha < \alpha'$ implies $\theta(x,\alpha) > \theta(x,\alpha')$ for each $x$ in $S$, will be
called a confidence curve estimator. We shall usually consider
the inclusive case, $0 \le \alpha \le 1$, so as to include formally all other
cases. In many problems it is convenient to give such estimators
a form which can be reported graphically: if for each $x \in S$, $\theta(x,\alpha)$
increases continuously over the whole range of $\Omega$ as $\alpha$ decreases
from 1 to 0, then we define the confidence curve estimator $c(\theta,x)$,
for each $x \in S$, as the continuous curve (function of $\theta \in \Omega$)

\[ c(\theta, x) = \min[\alpha,\ 1-\alpha \mid \theta(x,\alpha) = \theta]. \]

For example, if $X$ is normally distributed with unit variance and
mean $\theta$, then the confidence curve estimator of $\theta$ is

\[ c(\theta, x) = \begin{cases} \Phi(\theta - x), & -\infty \le \theta \le x, \\ 1 - \Phi(\theta - x), & x \le \theta \le \infty; \end{cases} \]

for any observed value $x$, the estimate $c(\theta,x)$ can be described by
a more or less complete sketch of its graph when convenient. Such
estimates are illustrated in a number of examples in Section 9
below.
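
The normal example can be tabulated directly. The sketch below assumes the unit-variance normal model of the example, for which $\theta(x,\alpha) = x - \Phi^{-1}(\alpha)$; the name `confidence_curve`, the observed value `x_obs`, and the grid are illustrative choices.

```python
from statistics import NormalDist

PHI = NormalDist().cdf


def confidence_curve(theta, x):
    """c(theta, x) = min(alpha, 1 - alpha) at the alpha with theta(x, alpha) = theta.

    For X ~ N(theta, 1), theta(x, alpha) = x - inv_cdf(alpha), so the level
    whose limit equals theta is alpha = 1 - PHI(theta - x).
    """
    alpha = 1.0 - PHI(theta - x)
    return min(alpha, 1.0 - alpha)


# Hypothetical observation; tabulate the curve on a coarse grid around it.
x_obs = 1.3
grid = [x_obs + 0.5 * k for k in range(-6, 7)]
curve = [(theta, confidence_curve(theta, x_obs)) for theta in grid]
```

The curve peaks at 1/2 at $\theta = x$ and falls off symmetrically; reading it at height .16 on both sides recovers the interval $[\theta(x,.84),\ \theta(x,.16)]$.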

The definitions of admissibility and of complete classes for
confidence curve estimators parallel those above for confidence
interval estimators. A simple sufficient (but not, in general,
necessary) condition that a confidence curve estimator be
admissible is that for each $\alpha$, its element $\theta^*(x,\alpha)$ be an admissible
point estimator. In problems for which there exists a uniformly
best confidence limit estimator for each confidence coefficient,
this condition is necessary as well as sufficient, and there is a
unique (a.e.) admissible confidence curve estimator which consists
simply of the family of these best confidence limit estimators.

6. Elementary theory of admissible point estimators. An important
part of the general theory of admissible point estimators, and of
corresponding practical techniques of estimation, can be developed
conveniently by an essentially elementary use of the theory of
tests of one-sided hypotheses as originated by Neyman and Pearson
and as extended (by simple use of their Fundamental Lemma) to
generate a variety of admissible tests of such hypotheses. In
problems for which uniformly best one-sided tests exist, the
complete theory of admissible estimators is obtained in this way; for
other problems, the development of the remaining parts of the
theory requires more general methods introduced in Section 10 below.
For each $\theta_0$ in $\Omega$ we consider two one-sided testing problems:
(a) the problem of testing the hypothesis $H(\theta_0)$: $\theta \le \theta_0$ (against
the general alternative $H'(\theta_0)$: $\theta > \theta_0$); and (b) the problem of
testing $H(\theta_0-)$: $\theta < \theta_0$ (against the general alternative
$H'(\theta_0-)$: $\theta \ge \theta_0$). In case $\theta_0$ is a minimum value in $\Omega$,
consideration of $H(\theta_0-)$ is to be omitted; if $\theta_0$ is a maximum in
$\Omega$, $H(\theta_0)$ is omitted. Any given point estimator $\theta^* = \theta^*(x)$ of
$\theta$ can be used in the following way to define a test of each of the
hypotheses mentioned: Accept the hypothesis if and only if the
observed value $\theta^*(x)$ is consistent with the hypothesis. Such a
test of the hypothesis $H(\theta_0)$ has the acceptance region
$A(\theta_0) = \{x \mid \theta^*(x) \le \theta_0\}$; such a test of $H(\theta_0-)$ has acceptance
region $A(\theta_0-) = \{x \mid \theta^*(x) < \theta_0\}$.

If $\theta_1 < \theta_2$, then $A(\theta_1-) \subset A(\theta_1) \subset A(\theta_2-) \subset A(\theta_2)$; for brevity, we
shall say that such a sequence of sets $A(\theta)$ is nondecreasing in $\theta$,
with the understanding that the argument $\theta$ may take a value
$(\theta-)$ which is considered smaller than $\theta$ and larger than $\theta - \epsilon$
for each positive $\epsilon$.

Such a test of $H(\theta_0-)$ has probabilities of errors of Type I
given by

\[ 1 - \mathrm{Prob}(A(\theta_0-) \mid \theta) = a(\theta_0, \theta, \theta^*) \quad \text{for each } \theta < \theta_0, \]

and of Type II given by

\[ \mathrm{Prob}(A(\theta_0-) \mid \theta) = a(\theta_0-, \theta, \theta^*) \quad \text{for each } \theta \ge \theta_0. \]

Such a test of $H(\theta_0)$ has probabilities of errors of Type I given
by

\[ 1 - \mathrm{Prob}(A(\theta_0) \mid \theta) = a(\theta_0+, \theta, \theta^*) \quad \text{for each } \theta \le \theta_0, \]

and of Type II given by

\[ \mathrm{Prob}(A(\theta_0) \mid \theta) = a(\theta_0, \theta, \theta^*) \quad \text{for each } \theta > \theta_0. \]

Thus each of the error-probabilities $a(u,\theta,\theta^*)$, upon which depends
the admissibility of any given point estimator $\theta^*$, appears as an
error-probability of a test of a one-sided hypothesis based upon
use of $\theta^*$. These relationships provide the following simple
sufficient condition for admissibility of a point estimator.
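
The identification of risk curves with test error probabilities can be made concrete in a simple case. The sketch below assumes a hypothetical setting not taken from the paper: $\theta^*$ is the mean of $n$ independent $N(\theta, 1)$ observations, so that $\theta^*$ has a continuous distribution and the regions $A(u)$ and $A(u-)$ have equal probability. The function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist().cdf


def accept_prob(theta0, theta, n):
    """Prob(A(theta0) | theta) = Prob[sample mean <= theta0 | theta]."""
    return PHI(sqrt(n) * (theta0 - theta))


def risk_curve(u, theta, n):
    """a(u, theta, theta*) for theta* = sample mean of n N(theta, 1) values."""
    if u < theta:
        # Type II error of the test of H(u): accept although theta > u.
        return accept_prob(u, theta, n)
    if u > theta:
        # Type I error of the test of H(u-): reject although theta < u.
        return 1.0 - accept_prob(u, theta, n)
    raise ValueError("a(u, theta) is defined here only for u != theta")
```

For example, `risk_curve(0.0, 0.5, 16)` and `risk_curve(1.0, 0.5, 16)` both equal $\Phi(-2)$, the probability that the sample mean errs by at least 0.5 in the given direction.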

Lemma 1. For any specified family of probability density functions
$f(x,\theta)$ (with respect to an underlying $\sigma$-finite measure $\mu(x)$ defined
on the sample space $S = \{x\}$), $\theta \in \Omega$ (a subset of the real line),
a given estimator $\theta^* = \theta^*(x)$ (any measurable function taking
values in the closure $\bar\Omega$ of $\Omega$) is admissible if each of the
acceptance regions $A(\theta_0)$, $A(\theta_0-)$, based on $\theta^*$ as defined above, gives an
admissible test of the corresponding one-sided hypotheses
$H(\theta_0)$, $H(\theta_0-)$ defined above.

Proof: (A test is called admissible if no other test has all
error-probabilities at least as small, with at least one strictly smaller.)
If $\theta^*$ satisfies the assumptions of the Lemma but is inadmissible,
let $\theta^{**}$ be an estimator better than $\theta^*$. Then
$a(\theta_0,\theta,\theta^{**}) \le a(\theta_0,\theta,\theta^*)$ for each $\theta \in \Omega$ and each $\theta_0 \ne \theta$, and the
inequality is strict for some $\theta = \theta' \in \Omega$ and some
$\theta_0 = \theta_0' \in \Omega$, $\theta_0' \ne \theta'$. Assume for definiteness that $\theta_0' > \theta'$ (the
other case can be discussed in the same way). Then the acceptance
region $\{x \mid \theta^{**}(x) < \theta_0'\}$ gives a better test of the hypothesis $H(\theta_0'-)$
than does $\{x \mid \theta^*(x) < \theta_0'\}$. This contradicts the assumed
admissibility of the test based on the latter region, completing the proof.

Many estimators of interest can be conveniently investigated
theoretically and constructed practically by the device of using as
indicated below a function $v(x,\theta)$, defined for each sample point $x$
and each $\theta \in \Omega$. If, for each fixed $\theta$, $v(x,\theta)$ is a measurable
function of $x$, it is a statistic; and as $\theta$ varies, $v(x,\theta)$ represents
a family of statistics. We term such a function $v$ a quasistatistic.

Corollary 1. A sufficient condition for admissibility of an
estimator $\theta^*(x)$ is that it be defined, for each $x$, as the solution $\theta$ of
the equation $v(x,\theta) = 0$, where $v$ is a quasistatistic such that:

(a) For each $x$ in $S$, $v(x,\theta) = 0$ holds for a unique $\theta$ in $\Omega$.

(b) If $\theta_1 < \theta_2$ and $\theta_1$, $\theta_2$ are in $\Omega$, then
$\{x \mid v(x,\theta_1) \le 0\} \subset \{x \mid v(x,\theta_2) < 0\}$.

(A simple sufficient condition for (b) is that for each $x$, $v(x,\theta)$
be nonincreasing in $\theta$.)

(c) For each $\theta_0$ in $\Omega$, the acceptance regions $\{x \mid v(x,\theta_0) \le 0\}$ and
$\{x \mid v(x,\theta_0) < 0\}$ are admissible respectively for testing the
one-sided hypotheses $H(\theta_0)$ and $H(\theta_0-)$.

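Proof: If $v(x,\theta)$ satisfies the stated conditions, the conclusion
follows immediately from Lemma 1 upon observing that

\[ \{x \mid v(x,\theta_0) \le 0\} = \{x \mid \theta^*(x) \le \theta_0\} \quad \text{and} \quad \{x \mid v(x,\theta_0) < 0\} = \{x \mid \theta^*(x) < \theta_0\}. \]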

When an estimator $\theta^*$ is defined implicitly, by use of a
quasistatistic $v(x,\theta)$, as the solution $\theta$ of the equation $v(x,\theta) = 0$, in
applications it is not necessary to have an explicit formula for
$\theta^*(x)$ since for any observed sample point $x$ it suffices merely to
determine the corresponding root $\theta$ of the defining equation; and in
the cases of many such estimators of practical and theoretical
interest, no explicit formula for $\theta^*(x)$ is available. The
preceding lemma shows that basic qualitative properties of efficiency
can be established for such estimators without use of any explicit
formula for $\theta^*(x)$. Their quantitative properties can also be
determined without such explicit formulas: Since $v(x,u) < 0$ is
equivalent to $\theta^*(x) < u$, and $v(x,u) = 0$ is equivalent to
$\theta^*(x) = u$, we have

\[ \mathrm{Prob}[\theta^*(X) \le u \mid \theta] = \mathrm{Prob}[v(X,u) \le 0 \mid \theta] \quad \text{for } u < \theta, \]

\[ \mathrm{Prob}[\theta^*(X) \ge u \mid \theta] = \mathrm{Prob}[v(X,u) \ge 0 \mid \theta] \quad \text{for } u > \theta. \]

Thus all quantitative properties of such estimators $\theta^*$ can be
determined, when convenient, by determining

\[ \mathrm{Prob}[v(X,u) \le 0 \mid \theta] \quad \text{and} \quad \mathrm{Prob}[v(X,u) = 0 \mid \theta] \quad \text{for each } u \ne \theta. \]
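
An implicitly defined estimator of this kind needs only a root-finder. The following is a minimal sketch under assumptions not found in the paper: the observations follow a logistic location family, whose score $\sum_i \tanh((y_i - \theta)/2)$ is strictly decreasing in $\theta$ and has no closed-form root, and the root is found by bisection. The names `score` and `solve_root` and the sample values are illustrative.

```python
from math import tanh


def score(sample, theta):
    """Logistic-location score: sum of tanh((y - theta)/2), decreasing in theta."""
    return sum(tanh((y - theta) / 2.0) for y in sample)


def solve_root(v, sample, lo, hi, tol=1e-10):
    """Bisection for the root of theta -> v(sample, theta), assumed decreasing."""
    if not (v(sample, lo) > 0.0 > v(sample, hi)):
        raise ValueError("root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if v(sample, mid) > 0.0:
            lo = mid   # root lies to the right of mid
        else:
            hi = mid   # root lies at or to the left of mid
    return 0.5 * (lo + hi)


# Hypothetical data; the bracket extends one unit beyond the sample range.
sample = [-1.2, 0.3, 0.8, 2.5, 4.1]
theta_star = solve_root(score, sample, min(sample) - 1.0, max(sample) + 1.0)
```

The equivalence used in the text can be checked on such data without any formula for the estimator: for any $u$, `score(sample, u) <= 0` holds exactly when `theta_star <= u` (up to the bisection tolerance).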

Some theoretical properties of such estimators are also
conveniently treated in terms of the c.d.f.'s of $v$. For example, if
for each $n = 1, 2, \ldots$, $\theta^*_n$ is an estimator determined by a
quasistatistic $v_n = v_n(x_n,\theta)$, then the condition that the sequence of
estimators $\theta^*_n$ be consistent (that is, that $\lim_{n \to \infty} a(u,\theta,\theta^*_n) = 0$, for
each $\theta \in \Omega$ and each $u \ne \theta$), can be stated, and in many cases
conveniently proved, in the form: $\lim_n \mathrm{Prob}[v_n(X_n,u) \le 0 \mid \theta] = 0$ or 1,
according as $u < \theta$ or $u > \theta$, for each $\theta \in \Omega$.

For estimation by confidence intervals or confidence curves,
it is sometimes convenient to employ a family of quasistatistics.
Suppose that for each of several values of an index $\alpha$, $v(x,\theta,\alpha)$ is
a quasistatistic which determines as above an estimator $\theta(x,\alpha)$, and
that, for each $x$ in $S$, $\theta(x,\alpha)$ is decreasing in $\alpha$. Then for any
pair of values of $\alpha$, $\alpha' > \alpha''$, the pair of estimators
$[\theta(x,\alpha'),\ \theta(x,\alpha'')] = J(x)$ is an interval estimator of $\theta$, whose
quantitative properties may be investigated in terms of the
distributions of $v(X,u,\alpha)$ as indicated above, and whose admissibility
can in some cases be established by direct application of Corollary 1
to $v(x,\theta,\alpha')$ and $v(x,\theta,\alpha'')$. A case of interest is that in which
$\alpha = \mathrm{Prob}[v(X,\theta,\alpha) \le 0 \mid \theta] = \mathrm{Prob}[v(X,\theta,\alpha) < 0 \mid \theta]$ for each
$\alpha$, $0 \le \alpha \le 1$, and each $\theta \in \Omega$. Then the family of estimators
$\theta(x,\alpha)$ constitutes a confidence curve estimator of $\theta$ (assuming
again that $v(x,\theta,\alpha)$ is decreasing in $\alpha$); this estimator is admissible
if for each $\alpha$ the quasistatistic $v(x,\theta,\alpha)$ satisfies the assumptions
of Corollary 1. Examples of such estimators, and of convenient
techniques for their computation and presentation, are given below.

7. Uniformly best estimators. Let $\theta^*(x)$ be any estimator of
$\theta \in \Omega$. $\theta^*$ will be called a uniformly best estimator of $\theta$ if, among
all estimators with the same location functions $a(\theta-,\theta)$, $a(\theta+,\theta)$, $\theta^*$
has uniformly minimum error-probabilities $a(u,\theta)$. Since the
$a(u,\theta)$'s are error-probabilities of tests of one-sided hypotheses
$H(\theta_0-)$, $H(\theta_0)$, $\theta_0 \in \Omega$, with respective acceptance regions
$A(\theta_0-) = \{x \mid \theta^*(x) < \theta_0\}$, $A(\theta_0) = \{x \mid \theta^*(x) \le \theta_0\}$, a necessary
condition for $\theta^*$ to be a uniformly best estimator is that $f(x,\theta)$
and $\Omega$ admit uniformly best tests of the hypotheses $H(\theta_0-)$, $H(\theta_0)$,
of respective sizes $a(\theta_0-,\theta_0,\theta^*)$, $1 - a(\theta_0+,\theta_0,\theta^*)$, $\theta_0 \in \Omega$.

It is well known [12] that uniformly best one-sided tests of
all sizes exist if and only if there exists a sufficient statistic
$t(x)$ with the monotone likelihood ratio (m.l.r.) property, in which
case each best test may be obtained by use of an acceptance region
of the form

\[ A(\theta_0-) = \{(x,y) \mid z(t(x),y,\theta_0) \le a(\theta_0-,\theta_0)\} \quad \text{or} \]
\[ A(\theta_0) = \{(x,y) \mid z(t(x),y,\theta_0) \le 1 - a(\theta_0+,\theta_0)\}, \]

where $y$ is the observed value of a uniformly distributed auxiliary
randomization variable $Y$, $0 \le Y < 1$, and $Z$ is the continuous
probability integral transform of $t(X)$:

\[ z(t(x),y,\theta) = y\,F(t(x),\theta) + (1-y)\,F(t(x)-,\theta), \quad \text{where} \quad F(t,\theta) = \mathrm{Prob}\{t(X) \le t \mid \theta\}. \]

If such a sufficient statistic $t(x)$ exists, then a simple
sufficient condition for admissibility of an estimator $\theta^*$ is clearly
that $\theta^*$ be a non-decreasing function of $t(x)$; for then
$A(\theta_0-) = \{x \mid \theta^*(t(x)) < \theta_0\}$ and $A(\theta_0) = \{x \mid \theta^*(t(x)) \le \theta_0\}$
are uniformly best one-sided tests. If
such a statistic $t(x)$ has a discrete distribution on a subset of the
integers, then $t(x) + y$ is another sufficient statistic having
the monotone likelihood ratio property, and having a continuous
c.d.f. under each $\theta$; as above, a simple sufficient condition for
admissibility of an estimator $\theta^*$ is that it be a non-decreasing
function of $t(x) + y$.
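
The randomized probability integral transform can be illustrated for a discrete statistic. The sketch below assumes, purely for illustration, that $t(X)$ is binomial with known index `n` and success probability `p` playing the role of $\theta$; the function names are hypothetical. The helper `prob_z_at_most` integrates the uniform variable $y$ out analytically, so the uniformity of $Z$ can be checked without simulation.

```python
from math import comb


def binom_cdf(t, n, p):
    """F(t, p) = Prob[T <= t | p] for T ~ Binomial(n, p); zero for t < 0."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(0, min(t, n) + 1))


def z_transform(t, y, n, p):
    """z(t, y, p) = y F(t, p) + (1 - y) F(t-, p), with 0 <= y < 1."""
    return y * binom_cdf(t, n, p) + (1.0 - y) * binom_cdf(t - 1, n, p)


def prob_z_at_most(c, n, p):
    """Exact Prob[Z <= c | p] for 0 < p < 1, averaging over the uniform y."""
    total = 0.0
    for t in range(n + 1):
        lower, upper = binom_cdf(t - 1, n, p), binom_cdf(t, n, p)
        mass = upper - lower                      # Prob[T = t | p]
        # Fraction of y in [0, 1) for which z(t, y, p) <= c.
        frac = min(max((c - lower) / mass, 0.0), 1.0)
        total += mass * frac
    return total
```

For any level `c` between 0 and 1, `prob_z_at_most(c, n, p)` returns `c` up to rounding, which is the continuity property that allows acceptance regions of every exact size.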

More generally, let $\theta^*$ be any estimator, let
$G(\theta) = \mathrm{Prob}\{\theta^*(X) \le \theta \mid \theta\}$, let $G(\theta-) = \mathrm{Prob}\{\theta^*(X) < \theta \mid \theta\}$, let
$F(t,\theta) = \mathrm{Prob}\{t(X) \le t \mid \theta\}$, where $t(x)$ is a sufficient statistic
with the m.l.r. property, and as above let
$z(t(x),y,\theta) = y\,F(t(x),\theta) + (1-y)\,F(t(x)-,\theta)$. Consider the
quasistatistic $v = v(x,y,\theta) = z(t(x),y,\theta) - G(\theta)$. For each $\theta_0$,
$A(\theta_0) = \{(x,y) \mid v(x,y,\theta_0) < 0\}$ is clearly a uniformly best
acceptance region for testing $H(\theta_0)$ at level $1 - G(\theta_0) = a(\theta_0+,\theta_0,\theta^*)$.
Consider the quasistatistic $v' = v'(x,y,\theta) = z(t(x),y,\theta) - G(\theta-)$
$\equiv v + [G(\theta) - G(\theta-)]$. For each $\theta_0$, $A(\theta_0-) = \{(x,y) \mid v'(x,y,\theta_0) < 0\}$
is clearly a uniformly best acceptance region for testing $H(\theta_0-)$;
at $\theta = \theta_0$ it has Type II error probability $G(\theta_0-) = a(\theta_0-,\theta_0,\theta^*)$.

To verify that these acceptance regions constitute a sequence
of sets which is nondecreasing in $\theta$ in the sense defined in
Section 6, we note that obviously $A(\theta_0-) \subset A(\theta_0)$, and we proceed
to prove that $\theta_1 < \theta_2$ implies $A(\theta_1) \subset A(\theta_2-)$: Assume that
$(x',y') \in A(\theta_1)$, but $(x',y') \notin A(\theta_2-)$; then
$z' \equiv z(t(x'),y',\theta_1) < G(\theta_1)$ and $z'' \equiv z(t(x'),y',\theta_2) \ge G(\theta_2-)$. A
best test of $H(\theta_1)$ of size $(1-z')$ (the test which rejects when
$z(t(x),y,\theta_1) > z'$) has maximum power at $\theta = \theta_2$, namely $1-z''$; the
test with acceptance region $\{x \mid \theta^*(x) \le \theta_1\}$ has size
$1 - G(\theta_1) < (1-z')$ and hence has power
$\mathrm{Prob}\{\theta^*(X) > \theta_1 \mid \theta_2\} < 1 - z''$.
Hence $z'' < \mathrm{Prob}\{\theta^*(X) \le \theta_1 \mid \theta_2\} \le \mathrm{Prob}\{\theta^*(X) < \theta_2 \mid \theta_2\} = G(\theta_2-)$, a
contradiction which proves that $A(\theta_1) \subset A(\theta_2-)$.

For each $(x,y)$, let $\theta^{**} = \theta^{**}(x,y)$ be defined by
$\theta^{**}(x,y) = \inf\{\theta \mid \theta \in \Omega,\ (x,y) \in A(\theta)\}$. Then $\theta^{**}$ is a
non-decreasing function of $t(x)$ and of $y$, and is a uniformly best
estimator having the same location functions as the arbitrarily
given $\theta^*$. If each best test is admissible, then $\theta^{**}$ is admissible,
and hence is strictly better than $\theta^*$ or else it is equivalent to
$\theta^*$. These considerations establish the following

Lemma 2. If the family of density functions $f(x,\theta)$, $\theta \in \Omega$, admits
a sufficient statistic $t = t(x)$ having the monotone likelihood ratio
property, then an essentially complete class of estimators is
constituted by estimators of the form $\theta^* = \theta^*(t,y)$, any nondecreasing
function of $t$ and of $y$, where $y$ is an observed value of an auxiliary
randomization variable $Y$ having under each $\theta$ the same uniform
distribution on the unit interval $0 \le y < 1$, and such that $t' < t''$
implies $\theta^*(t',y') \le \theta^*(t'',y'')$ for all $y', y''$.
If $t(x)$ has a continuous c.d.f. for each $\theta$, then estimators
of this form but not depending upon $y$ constitute an essentially
complete class of estimators.

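8. Score quasistatistics and generalized maximum likelihood
estimators.

For a given family $f(x,\theta)$, $\theta \in \Omega$, let $\theta_1(\theta)$, $\theta_2(\theta)$ be two
functions defined on $\Omega$, taking values in $\Omega$, and satisfying
$\theta_1(\theta) < \theta_2(\theta)$ and $\theta_1(\theta) \le \theta \le \theta_2(\theta)$ for $\theta \in \Omega$. Then for each
$\theta' \in \Omega$, a best test of $H_1$: $\theta = \theta_1(\theta')$ against $H_2$: $\theta = \theta_2(\theta')$ is one
which accepts when the quasistatistic

\[ S(x,\theta_1(\theta),\theta_2(\theta)) \equiv [\log f(x,\theta_2(\theta)) - \log f(x,\theta_1(\theta))] / [\theta_2(\theta) - \theta_1(\theta)] \]

satisfies $S(x,\theta_1(\theta'),\theta_2(\theta')) < G(\theta',\alpha(\theta'))$, where $G(\theta',\alpha(\theta'))$ is a
constant such that $\alpha(\theta')$ is the probability, when $\theta'$ is true, that this
inequality will be satisfied. For many problems the functions
$\theta_1(\theta)$, $\theta_2(\theta)$, and $\alpha(\theta)$ can be chosen so that the generalized score
quasistatistic $v(x,\theta) = S(x,\theta_1(\theta),\theta_2(\theta)) - G(\theta,\alpha(\theta))$, $\theta \in \Omega$,
satisfies the conditions of Corollary 1 and hence defines an
admissible estimator $\theta^*(x)$ as the solution $\theta$ of the equation
$v(x,\theta) = 0$. If, for example, $\mathrm{Prob}\{v(X,\theta) = 0 \mid \theta\} \equiv 0$ for $\theta \in \Omega$,
and the set $\{x \mid f(x,\theta) > 0\}$ is independent of $\theta \in \Omega$, then each
acceptance region $\{x \mid v(x,\theta) \le 0\}$ gives a best test which is
essentially unique (a.e. $P_\theta$, $\theta \in \Omega$), and hence admissible for
testing $H(\theta)$ and $H(\theta-)$.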

Again, as $\theta_2(\theta) - \theta_1(\theta) \to 0$,

\[ S(x,\theta_1(\theta),\theta_2(\theta)) \to S(x,\theta) \equiv \frac{\partial}{\partial\theta} \log f(x,\theta), \]

if the derivative exists at each $x$, for each $\theta \in \Omega$; consider as
above the (locally-best) score quasistatistic
$v(x,\theta) = S(x,\theta) - G(\theta,\alpha(\theta))$. Again, if this $v(x,\theta)$ satisfies the
conditions of Corollary 1, then an admissible estimator $\theta^*(x)$ is
defined as the solution $\theta$ of the equation $v(x,\theta) = 0$. It is well
known that, under a mild regularity condition, an acceptance region
$\{x \mid v(x,\theta) \le 0\}$ gives a locally-best test of $H(\theta)$ and of $H(\theta-)$; under
additional mild restrictions, such as those mentioned above, these
tests are also admissible. The case $G(\theta,\alpha(\theta)) \equiv 0$, $\theta \in \Omega$,
determines (through the equation $S(x,\theta) = 0$) the maximum likelihood
estimator $\hat\theta(x)$, which is thus shown to be admissible (and to be
locally-best, i.e. to minimize $a(u,\theta)$ for $\theta$ near $u$, among all
estimators with the same location functions) provided that
$v(x,\theta) = S(x,\theta)$ satisfies the conditions mentioned. Estimators of
this form were proposed by Tukey [10] on different theoretical
grounds in connection with the methods discussed in Section 5 above.

Estimators defined by use of the various score quasistatistics
mentioned may be called generalized maximum likelihood estimators.

If $S(x,\theta)$ has (or may have) discontinuous distributions, it
can be replaced, as may be desired at least for some theoretical
purposes, by its continuous probability integral transform

\[ z(x,y,\theta) = y \cdot \mathrm{Prob}[S(X,\theta) \le S(x,\theta) \mid \theta] + (1-y) \cdot \mathrm{Prob}[S(X,\theta) < S(x,\theta) \mid \theta], \]

where $y$ is the observed value of $Y$, an auxiliary randomization
variable having, for each $\theta$, the same uniform density on $0 \le y < 1$.
Then for each $\theta$, $\alpha(\theta)$ may be prescribed arbitrarily, and the
statistic

\[ v(x,y,\theta,\alpha(\theta)) = z(x,y,\theta) - \alpha(\theta) \]

has a continuous distribution and takes negative values with
probability $\alpha(\theta)$. In suitable problems, with suitable choices of $\alpha(\theta)$,
the quasistatistic $v$ so defined will satisfy the conditions of
Corollary 1. The same treatment can be applied to the form
$S(x,\theta_1(\theta),\theta_2(\theta))$. To avoid technicalities of little intrinsic
interest, we discuss the case in which such randomization is not
used.

If $\mathrm{Prob}\{v(X,\theta) = 0 \mid \theta\} = 0$ for each $\theta \in \Omega$, then each such
estimator has the location functions $a(\theta-,\theta) \equiv 1 - a(\theta+,\theta) \equiv \alpha(\theta)$.
If $\alpha(\theta) \equiv \alpha$, a constant, such an estimator is a confidence limit;
if $\alpha(\theta) \equiv 1/2$, such an estimator is a median-unbiased point
estimator. In the important case that $X = (Y_1,\ldots,Y_n)$, a sample of
independent observations $Y_i$, we have $S(X,\theta) = \sum_{i=1}^{n} S(Y_i,\theta)$; the normal
approximation (based on the Central Limit Theorem)

\[ a(\theta-,\theta,\hat\theta) = \mathrm{Prob}\{S(X,\theta) < 0 \mid \theta\} \approx \Phi(0) = 1/2 \]

(using that $E(S(X,\theta) \mid \theta) \equiv 0$) is often close; hence in such cases
the maximum likelihood estimator $\hat\theta(x)$ is approximately
median-unbiased. If $S(X,\theta)$ has a symmetrical distribution under $\theta$, then
clearly $\hat\theta$ is exactly median-unbiased.

In some cases, as illustrated below, a family of score
quasistatistics, e.g.

\[ v(x,\theta,\alpha) = S(x,\theta) - G(\theta,\alpha), \quad 0 \le \alpha \le 1, \]

or

\[ v(x,\theta,\alpha) = S(x,\theta_1(\theta),\theta_2(\theta)) - G(\theta,\alpha), \quad 0 \le \alpha \le 1, \]

can be used to determine admissible confidence curve estimators
$\theta(x,\alpha)$, $0 \le \alpha \le 1$, as solutions of equations $v(x,\theta,\alpha) = 0$.
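
A family of score quasistatistics of the first form can be written out in the simplest case. The sketch below assumes a hypothetical sample of independent $N(\theta, 1)$ observations, for which $S(x,\theta) = \sum_i (y_i - \theta)$ is normal with variance $n$ and $G(\theta,\alpha) = \sqrt{n}\,\Phi^{-1}(\alpha)$ holds exactly; the names `v` and `limit` and the data are illustrative.

```python
from math import sqrt
from statistics import NormalDist, fmean

INV = NormalDist().inv_cdf


def v(sample, theta, alpha):
    """Score quasistatistic S(x, theta) - G(theta, alpha) for a N(theta, 1) sample."""
    n = len(sample)
    return sum(y - theta for y in sample) - sqrt(n) * INV(alpha)


def limit(sample, alpha):
    """Root in theta of v(sample, theta, alpha) = 0 (closed form in this model)."""
    return fmean(sample) - INV(alpha) / sqrt(len(sample))


# Hypothetical data, evaluated at the five standard levels mentioned in Section 5.
sample = [0.4, 1.9, 1.1, 2.6]
levels = [0.975, 0.84, 0.5, 0.16, 0.025]
curve = [(a, limit(sample, a)) for a in levels]
```

In models without a closed-form root the same family can be solved level by level with a one-dimensional root-finder, provided each member is monotone in $\theta$.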

Estimators based on score quasistatistics have direct
usefulness, which is enhanced by the simplicity of their theory and of the
practical techniques for their use. In addition they are of special
theoretical interest, due to their relations to the asymptotic
theory and techniques of maximum likelihood estimation; they
generalize and justify these techniques in an exact sense. The
following considerations lend them further intrinsic interest: For
any given problem of estimation of $\theta$, consider the class of
estimators having specified location functions $a(\theta-,\theta)$, $a(\theta+,\theta)$. For
each $\theta \in \Omega$ and each $u \ne \theta$, $u \in \Omega$, let $a(u,\theta) = \min_{\theta^*} a(u,\theta,\theta^*)$, where
for $u > \theta$ the minimum is taken over all estimators such that
$a(\theta+,\theta,\theta^*) = a(\theta+,\theta)$, and for $u < \theta$ the minimum is taken over all
estimators such that $a(\theta-,\theta,\theta^*) = a(\theta-,\theta)$. Then $a(u,\theta)$ is the
envelope risk curve (i.e. the minimum of the respective ordinates of
risk curves) for the class of estimators with the given location
functions. For each $(u,\theta)$, it is possible to attain $a(u,\theta)$ in the
following sense: if $u > \theta$, the relatively trivial estimator which
takes the value $\theta$ with probability $1 - a(\theta+,\theta)$ when $\theta$ is true, and
which takes the value $u$ otherwise, and which minimizes $a(u,\theta,\theta^*)$
subject to these conditions, is equivalent to a best test between
the simple hypotheses $\theta$ and $u$, of the indicated size; such a test
can be based on the score statistic $S(x,u,\theta)$; similar remarks
apply to the case $u < \theta$. Each such single statistic $S(x,u,\theta)$ can
be embedded, as an element, in a score quasistatistic
$S(x,\theta_1(\theta),\theta_2(\theta))$ for $\theta \in \Omega$; it may or may not be possible to define
by use of this quasistatistic an estimator which has the specified
location functions. An estimator can attain $a(u,\theta)$ uniformly in
$(u,\theta)$ only in problems having the special structure described in
Section 7 above, for which uniformly best estimators exist. In
other problems, some estimators defined by generalized score
quasistatistics attain $a(u,\theta)$ at some but not all $(u,\theta)$. In all
problems, the computation of $a(u,\theta)$ requires calculations of
probabilities of events defined by score statistics $S(x,u,\theta)$; and
the possibility of its attainment by some estimator at specified
points $(u,\theta)$ is related to the existence of suitable score
quasistatistics.

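8.1 Large-sample approximations.

If $x = (y_1,\ldots,y_n)$ is a sample of $n$ independent identically
distributed observations (non-identical distributions can be
discussed similarly), $S(x,\theta_1(\theta),\theta_2(\theta)) = \sum_{i=1}^{n} S(y_i,\theta_1(\theta),\theta_2(\theta))$. Let

\[ \mu(u,\theta) = E[S(Y_1,\theta_1(u),\theta_2(u)) \mid \theta] \quad \text{and} \quad \sigma^2(u,\theta) = \mathrm{Var}[S(Y_1,\theta_1(u),\theta_2(u)) \mid \theta] \]

exist for each $\theta, u \in \Omega$. We allow $\theta_1(\theta) \equiv \theta_2(\theta) \equiv \theta$ here, taking
$S(X,\theta,\theta) \equiv S(X,\theta)$ in this case, and assume that $\theta_1(\theta)$, $\theta_2(\theta)$ are
fixed, while $n$ may vary, in the present discussion.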

In the special case $v_n(x,\theta) = \sum_{i=1}^{n} S(y_i,\theta)$, which determines
the maximum likelihood estimator $\hat\theta_n(x)$ as the solution $\theta$ of
$v_n(x,\theta) = 0$, we have by Khintchine's Theorem (even if the $\sigma^2(u,\theta)$'s do
not exist) that $\frac{1}{n} v_n(X,u)$ converges in probability to $\mu(u,\theta)$ when
$\theta$ is true. If $u' < \theta < u''$ implies $\mu(u',\theta) > \mu(\theta,\theta) = 0 > \mu(u'',\theta)$,
then $\lim_n a(u,\theta,\hat\theta_n) = 0$ for $u \ne \theta$, that is, $\hat\theta_n$ is consistent.

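Returning to the general case, for large $n$ the Central Limit
Theorem gives the normal approximation to the distributions of
$v_n(X,u,\alpha) = \sum_{i=1}^{n} S(Y_i,\theta_1(u),\theta_2(u)) - G_n(u,\alpha)$:

\[ \mathrm{Prob}\{v_n(X,u,\alpha) \le 0 \mid \theta\} \approx \Phi\!\left( \frac{G_n(u,\alpha) - n\mu(u,\theta)}{\sqrt{n}\,\sigma(u,\theta)} \right), \]

and for $u = \theta$, the approximate determination of $G_n(\theta,\alpha)$:

\[ \alpha \approx \Phi\!\left( \frac{G_n(\theta,\alpha)}{\sqrt{n}\,\sigma(\theta,\theta)} \right), \quad \text{or} \quad G_n(\theta,\alpha) \approx \sqrt{n}\,\sigma(\theta,\theta)\,\Phi^{-1}(\alpha), \]

which in the preceding formula gives

\[ \mathrm{Prob}\{v_n(X,u) \le 0 \mid \theta\} \approx \Phi\!\left( -\sqrt{n}\,\frac{\mu(u,\theta)}{\sigma(u,\theta)} + \frac{\sigma(u,u)}{\sigma(u,\theta)}\,\Phi^{-1}(\alpha) \right). \]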


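For the maximum likelihood estimator, $G_n \equiv 0$, corresponding to
$\alpha = \frac{1}{2}$ in these formulae. Thus the risk curves of the confidence
limit estimator $\theta^*_n = \theta_n(x,\alpha)$ determined by $v_n(x,\theta,\alpha) = 0$ are
approximately

\[ a(u,\theta,\theta_n(\cdot,\alpha)) \approx \begin{cases} \Phi(h(u,\theta,\alpha,n)), & u < \theta, \\ 1 - \Phi(h(u,\theta,\alpha,n)), & u > \theta, \end{cases} \qquad 0 < \alpha < 1, \]

where

\[ h(u,\theta,\alpha,n) = -\sqrt{n}\,\frac{\mu(u,\theta)}{\sigma(u,\theta)} + \frac{\sigma(u,u)}{\sigma(u,\theta)}\,\Phi^{-1}(\alpha). \]

Here the sufficient (and necessary) condition for consistency of
$\theta_n(x,\alpha)$, for a fixed $\alpha$, $0 < \alpha < 1$, is again that $u' < \theta < u''$ imply
$\mu(u',\theta) > 0 > \mu(u'',\theta)$.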
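
The approximate risk curve is straightforward to evaluate once $\mu(u,\theta)$ and $\sigma(u,\theta)$ are available. The sketch below passes them in as functions; the exponential example attached to it (observations with mean $\theta$, score $S(y,u) = (y-u)/u^2$) is a hypothetical illustration rather than one of the paper's examples, and all names are assumptions.

```python
from math import sqrt
from statistics import NormalDist

_N = NormalDist()


def h(u, theta, alpha, n, mu, sigma):
    """h = -sqrt(n) mu(u,theta)/sigma(u,theta) + sigma(u,u)/sigma(u,theta) * inv_cdf(alpha)."""
    return (-sqrt(n) * mu(u, theta) / sigma(u, theta)
            + sigma(u, u) / sigma(u, theta) * _N.inv_cdf(alpha))


def approx_risk(u, theta, alpha, n, mu, sigma):
    """Normal approximation to a(u, theta, theta_n(., alpha)) for u != theta."""
    if u == theta:
        raise ValueError("the risk curve is defined only for u != theta")
    value = _N.cdf(h(u, theta, alpha, n, mu, sigma))
    return value if u < theta else 1.0 - value


# Hypothetical model: exponential observations with mean theta.
def exp_mu(u, theta):
    """E[(Y - u)/u**2 | theta] for exponential Y with mean theta."""
    return (theta - u) / u**2


def exp_sigma(u, theta):
    """Standard deviation of (Y - u)/u**2 under theta."""
    return theta / u**2
```

For instance, `approx_risk(1.0, 2.0, 0.5, 25, exp_mu, exp_sigma)` evaluates $\Phi(-2.5)$; here $\mu(u,\theta)$ is positive for $u < \theta$ and negative for $u > \theta$, so the approximate risk tends to zero on both sides as $n$ grows.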

The verification of the conditions of Corollary 1, for a given
$v(x,\theta)$, is sometimes difficult. Large-sample approximations are of
some theoretical and practical help in this connection. For example,
for a locally best confidence limit estimator $\theta(x,\alpha)$, where
$x = (y_1,\ldots,y_n)$ and the $Y_i$'s are independent and identically
distributed, we have as above

\[ G_n(\theta,\alpha) \approx \sqrt{n}\,\sigma(\theta,\theta)\,\Phi^{-1}(\alpha), \]

and we take

\[ v_n(x,\theta,\alpha) = S(x,\theta) - \sqrt{n}\,\sigma(\theta,\theta)\,\Phi^{-1}(\alpha). \]

If $S(x,\theta)$ satisfies the conditions of Corollary 1 (i.e. if for each
$x$ the maximum likelihood estimator $\hat\theta(x)$ is determined as the root $\theta$
of $S(x,\theta) = 0$), and is decreasing in $\theta$ for each $x$, then:

(A) If $\sigma(\theta,\theta)$ is constant (this is the case in some examples in the
following Section, but not in most examples), then $v_n(x,\theta,\alpha)$ is
also decreasing, as required by Corollary 1.

(B) If $\sigma(\theta,\theta)$ is decreasing or increasing at $\theta = \theta'$, then for a
fixed $x$ and $\alpha$ sufficiently near 0 or 1, either

\[ -\sqrt{n}\,\sigma(\theta,\theta)\,\Phi^{-1}(\alpha) \quad \text{or} \quad -\sqrt{n}\,\sigma(\theta,\theta)\,\Phi^{-1}(1-\alpha) \equiv \sqrt{n}\,\sigma(\theta,\theta)\,\Phi^{-1}(\alpha) \]

will be increasing more rapidly than $S(x,\theta)$ is decreasing at
$\theta = \theta'$, so that $v_n(x,\theta,\alpha)$ and $v_n(x,\theta,1-\alpha)$ cannot both be
decreasing in $\theta$ at $\theta'$.

(C) On the other hand, for any fixed $\alpha$, $0 < \alpha < 1$, since

\[ v_n(x,\theta,\alpha) = \sum_{i=1}^{n} \left[ S(y_i,\theta) - \frac{\sigma(\theta,\theta)}{\sqrt{n}}\,\Phi^{-1}(\alpha) \right], \]

a sufficient condition for $v_n$ to be decreasing in $\theta$ is that

\[ S(y_i,\theta) - \frac{\sigma(\theta,\theta)}{\sqrt{n}}\,\Phi^{-1}(\alpha) \]

be decreasing in $\theta$, for all values of $y_i$. Clearly as $n$
increases, this condition becomes a less restrictive one, being
in general satisfied for a wider range of values of $\alpha$.
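
Whether a given $v_n$ is in fact decreasing can be examined numerically before relying on Corollary 1. The sketch below uses a hypothetical Poisson model that is not among the paper's examples: for $n$ Poisson observations with mean $\theta$ and total `total`, $S(x,\theta) = \text{total}/\theta - n$ and $\sigma(\theta,\theta) = 1/\sqrt{\theta}$, which is not constant, so remark (B) applies. The function names and numerical inputs are assumptions.

```python
from math import sqrt
from statistics import NormalDist


def v_n(total, n, theta, alpha):
    """v_n = S(x, theta) - sqrt(n) sigma(theta, theta) inv_cdf(alpha), Poisson sample.

    Here S(x, theta) = total/theta - n and sigma(theta, theta) = 1/sqrt(theta).
    """
    z = NormalDist().inv_cdf(alpha)
    return total / theta - n - sqrt(n) * z / sqrt(theta)


def is_decreasing(total, n, alpha, grid):
    """True if v_n is strictly decreasing along the increasing grid of theta values."""
    values = [v_n(total, n, th, alpha) for th in grid]
    return all(a > b for a, b in zip(values, values[1:]))


grid = [0.5 * k for k in range(1, 2001)]      # theta from 0.5 to 1000
low = is_decreasing(30, 10, 0.16, grid)       # level below 1/2
high = is_decreasing(30, 10, 0.84, grid)      # complementary level
```

With these inputs `low` is true and `high` is false: at the level .84 the correction term eventually increases faster than the score decreases (for very large $\theta$), so the two complementary levels are not both monotone over the whole grid.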

8.2 Local approximations for locally best estimators.

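In cases where there exist precise estimators, that is
estimators whose risk curves are small except for $u$ very near $\theta$, it
is natural to center attention on small neighborhoods of the possible
true values $\theta$, and to consider estimators whose risk curves are
relatively small in such neighborhoods, such as those based on score
quasistatistics with $\theta_2(\theta) - \theta_1(\theta)$ small or zero for all $\theta$. If
$\mu'(u,\theta) \equiv \frac{\partial}{\partial u}\mu(u,\theta)$ and $\sigma'(u,\theta) \equiv \frac{\partial}{\partial u}\sigma(u,\theta)$ exist, then
$h'(u,\theta,\alpha,n) \equiv \frac{\partial}{\partial u} h(u,\theta,\alpha,n)$ gives the Taylor series approximation

\[ h(u,\theta,\alpha,n) \approx h(\theta,\theta,\alpha,n) + h'(\theta,\theta,\alpha,n)\,(u - \theta) \]

and a corresponding alternative form of the above approximation to
$a(u,\theta,\theta_n(\cdot,\alpha))$. In the special case of locally-best score
quasistatistics, since $\mu(\theta,\theta) = 0$ and $\mu'(\theta,\theta) = -\sigma^2(\theta,\theta)$, we find

\[ h(u,\theta,\alpha,n) \approx -\sqrt{n}\,\sigma(\theta,\theta)\,(\theta - u) + \Phi^{-1}(\alpha)\,[1 + c\,(\theta - u)], \]

where $c = c(\theta)$ denotes the first-order coefficient in the expansion
of $\sigma(u,u)/\sigma(u,\theta)$ about $u = \theta$, written in powers of $(\theta - u)$.

In the first term, the coefficient $\sqrt{n}\,\sigma(\theta,\theta)$ of the error $(\theta - u)$ is
$\sqrt{I(\theta)}$, where $I(\theta)$ is Fisher's "Information in $X$ at $\theta$." The second
term is zero for $\alpha = \frac{1}{2}$ and for the maximum likelihood estimator; for
other estimators, the first term dominates the second as $n$ increases.
The indicated approximations to risk curves are

\[ a(u,\theta,\hat\theta_n) = a(u,\theta,\theta_n(\cdot,\tfrac{1}{2})) \approx \Phi(-\sqrt{n}\,\sigma(\theta,\theta)\,|u - \theta|), \]

and for $\alpha \ne \frac{1}{2}$

\[ a(u,\theta,\theta_n(\cdot,\alpha)) \approx \begin{cases} \Phi\big(-\sqrt{n}\,\sigma(\theta,\theta)\,(\theta - u) + \Phi^{-1}(\alpha)\,[c\,(\theta - u) + 1]\big), & u < \theta, \\ 1 - \Phi(\text{same argument}), & u > \theta, \end{cases} \]

\[ \approx \text{(more roughly)} \quad \Phi(-\sqrt{n}\,\sigma(\theta,\theta)\,|u - \theta|). \]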

These approximations exhibit the approximate normality of distribution
of these estimators for large n. While locally best estimators
are in general not comparable with other estimators (e.g. those above
with θ_1(θ) < θ_2(θ) for all θ) having similar location functions,
except in problems of a simple structure, the designation
"Information" for I(θ) is clearly appropriate and useful for cases in
which so much precision is attainable that interest is practically
restricted to very small |u − θ|, in which case an appropriate choice
of an estimator will usually be one which is locally best or perhaps
one defined as above with θ_2(θ) − θ_1(θ) small for all θ.

It should be noted that the preceding approximations which
utilize a Taylor series approximation are not accompanied by bounds
on errors of approximation. Even in cases where such approximations
are very close, under a severely nonlinear transformation of the
parameter space (θ → η = η(θ), with η(θ) differentiable and increasing)
such approximations can become very inaccurate. Hence the principal
concrete value of such approximation formulae seems to be that they
provide convenient quantitative conjectures which are more or less
plausible but which require independent confirmation (or
disconfirmation) for specific problems and sample sizes. Similar remarks
apply to the preceding approximation formulae based on the Central
Limit Theorem only, with the qualification that such approximations
could be termed "less asymptotic" than those which also use the
Taylor series approximation, in the sense that the former
approximations are unaffected by monotone transformations of the parameter
space, and their use can be accompanied by use of the known bounds
on errors in the Central Limit Theorem approximation.
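As a rough numerical illustration of the "more rough" local approximation Φ(−√n σ(θ,θ)|u − θ|) to a risk curve, the following Python sketch evaluates it for given n and per-observation σ(θ,θ). The function name `approx_risk` and the choice of a unit-variance normal model (for which σ(θ,θ) = 1) are assumptions made only for this illustration; as noted above, such values are conjectures that need independent confirmation.

```python
from math import sqrt
from statistics import NormalDist


def approx_risk(u, theta, n, sigma):
    """Local normal approximation Phi(-sqrt(n) * sigma * |u - theta|).

    sigma is the per-observation standard deviation of the score at theta
    (hypothetical argument name); n is the sample size.
    """
    return NormalDist().cdf(-sqrt(n) * sigma * abs(u - theta))


# Assumed example: unit-variance normal mean, n = 25, error of size 0.4.
print(round(approx_risk(u=0.4, theta=0.0, n=25, sigma=1.0), 4))
```

At u = θ the approximation equals ½, consistent with median-unbiasedness, and it decreases as |u − θ| or n grows.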

8.3 Remarks on asymptotic efficiency of estimators.

The theory of the asymptotic efficiency of maximum likelihood
estimators (cf., for example, Cramér [13], pp. 500-504) utilizes a
criterion of asymptotic efficiency (l.c. 469-490) which is restrictive
in that it applies only to estimators having asymptotically normal
distributions with means equal to the parameter estimated; such
estimators are clearly asymptotically median-unbiased (probability
of underestimation approaches ½ as n increases). It is advantageous
to use a less restrictive criterion of asymptotic efficiency, one
which applies to all (sequences of) estimators which are
asymptotically median-unbiased. In order to embrace confidence limit
estimation as well as point estimation, it is advantageous to define
a criterion of asymptotic efficiency which can be applied to any
sequence of estimators whose probabilities of underestimation (at
each θ) converge with increasing n to a fixed constant α, 0 < α < 1;
any such sequence may be termed an asymptotically valid sequence of
confidence limit estimators (of specified coefficient α).

Under broad conditions (some simple ones were given above)
consistent estimators exist; it is then natural to define asymptotic
efficiency of estimators in terms of the properties of risk curves
of estimators in the neighborhood of the true value of θ: an
asymptotically efficient sequence of confidence limit estimators may
be defined informally as one which is asymptotically valid and
asymptotically locally best. The estimators defined above and
illustrated in the following section, based upon quasistatistics of
the form v_n(x_n,θ,α) = S(x_n,θ) − G_n(θ,α), provide examples of such
estimators, and have the further properties of being exactly
(non-asymptotically) valid and locally-best (and typically
admissible). Additional examples are based on quasistatistics of the
form v_n(x_n,θ,α) = S(x_n, θ_{1,n}(θ), θ_{2,n}(θ)) − G_n(θ,α), where as
n increases θ_{2,n}(θ) − θ_{1,n}(θ) decreases to zero rapidly enough to
give the asymptotically locally-best property; such estimators have the
further properties of exact validity and admissibility, and the
functions θ_{i,n}(θ) can be chosen so that for any finite sample size
a suitable emphasis is given to avoiding errors exceeding specified
positive magnitudes; for practical applications, such estimators
seem preferable in principle to (exactly) locally-best estimators.

The usual asymptotic theory (l.c.) is free of the important
assumption (b) of Corollary 1 above. From the present
non-asymptotic standpoint, for each θ the acceptance region
A(θ) = {x | S(x,θ) ≤ 0} represents a locally-best one-sided test, and
the family of such tests can be used as usual to define a confidence
region for estimation of θ, namely U(x) = {θ | x ∈ A(θ), θ ∈ Ω}; in
general such a confidence region will not have a constant confidence
coefficient, but its theory and interpretation in applications
follow usual lines. The failure of assumption (b) corresponds to
the failure of the sets A(θ) to constitute a nondecreasing sequence
in θ; this in turn corresponds to the fact that, for some x, the
confidence region U(x) will fail to constitute an interval
[θ*(x), ∞] which can be described by a lower estimator θ*(x). The
theory of admissible confidence regions not necessarily of interval
form, and their interpretation in applications, lie outside the
scope of the present paper. However, from the present standpoint it
may be observed that the principal role of the regularity assumptions
in, for example, Cramér (l.c.) is to guarantee that with increasing
n, for each θ the probability that U(x) will be an interval (or
equivalently that S(x_n,θ) will satisfy the assumptions of
Corollary 1) approaches unity. More precisely, with increasing n,
for each θ the probability of the set of points x_n on which S(x_n,u)
is decreasing in u (at least for u near θ) approaches unity. The
key step of the derivation from this standpoint is the observation
that the law of large numbers applies, when θ is true, to the sum
∂S(X_n,u)/∂u = Σ_{i=1}^{n} ∂S(Y_i,u)/∂u, each term of which has (at
least for u near θ) a negative expected value
E[∂² log f(Y_1,u)/∂u² | θ]. (Similar remarks apply to use of
generalized score quasistatistics which fail to satisfy condition (b)
of Corollary 1.) Dropping the qualification "for u near θ" gives that
the probability of multiple roots of S(x_n,θ) = 0 approaches zero with
increasing n. Asymptotic efficiency properties of confidence limits
and intervals defined by use of quasistatistics of the form
S(x_n,θ) − G_n(θ,α) were proved under broad regularity conditions by
Wald [14].

The remarks of Lehmann [15] on the limited value of any
exclusively-asymptotic theory of optimum tests apply with equal
force to estimation theory. Asymptotically efficient estimators
may approach efficiency at arbitrarily slow rates as n increases.
Only on the basis of an auxiliary non-asymptotic investigation of
the quantitative and/or qualitative (optimality) properties of an
asymptotically efficient estimator can it be recommended in an
application with a specified (finite) sample size.

9. Examples. Examples 1-3 illustrate that the formal treatment of
Section 8 can often be applied conveniently to problems admitting
uniformly best estimators.

Example 1. Normal mean. Let x = (y_1, ..., y_n) be a sample of n
independent observations from a normal distribution with known
variance, say σ² = 1, and unknown mean θ, −∞ < θ < ∞. Then

    f(x,θ) = (2π)^{-n/2} exp[−½ Σ_{i=1}^{n} (y_i − θ)²].

Let

    v(x,θ) = ∂ log f(x,θ)/∂θ − G(θ,α(θ)),

where α(θ) is a given function. Then

    v(x,θ) = n(ȳ − θ) − G(θ,α(θ)) = nȳ − nθ − √n Φ^{-1}(α(θ)),

where ȳ = (1/n) Σ_{i=1}^{n} y_i and Φ(u) is the standard normal c.d.f.
Then v(x,θ) clearly satisfies the conditions of Corollary 1 if α(θ) is
such that θ + (1/√n) Φ^{-1}(α(θ)) is increasing in θ; as n increases,
the latter condition becomes a less restrictive one on α(θ); it is
obviously satisfied if α(θ) ≡ α, 0 ≤ α ≤ 1. For each such function
α(θ), an admissible estimator θ*(x) is defined as the solution θ
of v(x,θ) = 0, that is, of

    θ + (1/√n) Φ^{-1}(α(θ)) = ȳ.

Denoting the solution by Q(ȳ), this gives θ*(x) = Q(ȳ); Q(ȳ) can be
any increasing function of ȳ if α(θ) is suitably chosen. For
α(θ) ≡ α, this becomes (in the general case where σ² is any positive
number)

    θ*(x) = θ(x,α) = ȳ − (σ/√n) Φ^{-1}(α),

an upper confidence limit of confidence coefficient 1−α (and/or a
lower confidence limit of coefficient α). Each of these estimators
is, by Lemma 2 above, uniformly best among all estimators with the
same location functions a(θ−,θ) = 1 − a(θ+,θ) = α(θ). Taking
α(θ) ≡ ½ gives θ(x,.5) = θ̂(x) = ȳ. Since this estimator is
independent of the value assumed for σ², the classical (maximum
likelihood and mean-unbiased) estimator ȳ is uniformly best among
all median-unbiased estimators of θ even if σ² is not known. The
same property clearly holds for the classical least squares
estimators of linear regression theory under normality assumptions.
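The confidence limit of Example 1 for constant α(θ) ≡ α, namely ȳ − (σ/√n)Φ^{-1}(α), can be sketched in a few lines of Python. The function name `normal_mean_limit` and the sample values are hypothetical and serve only to illustrate the formula.

```python
from math import sqrt
from statistics import NormalDist, fmean


def normal_mean_limit(ys, alpha, sigma=1.0):
    """theta(x, alpha) = ybar - sigma * Phi^{-1}(alpha) / sqrt(n), for 0 < alpha < 1.

    alpha is the probability of underestimation, so a small alpha gives
    an upper confidence limit of coefficient 1 - alpha.
    """
    n = len(ys)
    return fmean(ys) - sigma * NormalDist().inv_cdf(alpha) / sqrt(n)


sample = [1.0, 2.0, 3.0, 4.0]            # assumed data, known sigma = 1
print(normal_mean_limit(sample, 0.5))    # median-unbiased estimate: the sample mean
print(normal_mean_limit(sample, 0.025))  # upper limit of coefficient 0.975
print(normal_mean_limit(sample, 0.975))  # lower limit of coefficient 0.975
```

Evaluating the function over a grid of α values traces out the corresponding confidence curve estimate.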

Example 2. Normal variance. Let x = (y_1, ..., y_n) be a sample
of n independent observations from a normal distribution with known
mean, say μ = 0, and unknown standard deviation θ = σ, 0 < σ < ∞.
Then

    f(x,θ) = (2πσ²)^{-n/2} exp[−Σ_{i=1}^{n} y_i²/(2σ²)].

Let v(x,θ) = ∂ log f(x,θ)/∂θ − G(θ,α(θ)), where α(θ) is a given
function. Then

    v(x,θ) = (n/σ)(s²/σ² − 1) − G(σ,α(σ)),

where s² = (1/n) Σ_{i=1}^{n} y_i² is the usual unbiased estimator of
σ². For a given σ, ns²/σ² has the Chi-square distribution with n
degrees of freedom; hence G(σ,α(σ)) = (1/σ)(χ²_{n,α(σ)} − n), where
χ²_{n,α} is the lower α-point of the Chi-square distribution with n
degrees of freedom. Thus v(x,θ) = (1/σ)(ns²/σ² − χ²_{n,α(σ)}). If, for
example, α(σ) ≡ α, then

    θ*(x) = σ(x,α) = s √(n/χ²_{n,α}),

which is a uniformly best estimator, by Lemma 2 above. A uniformly
best median-unbiased estimator of σ is σ(x,.5). Similarly,
uniformly best estimators of the variance σ² are given by

    σ²(x,α) = s² n/χ²_{n,α}.

When n is not small, n/χ²_{n,.5} ≈ 1, and σ(x,.5) ≈ s and
σ²(x,.5) ≈ s². Thus the commonly used point estimators s and s² can be
justified on the grounds that they are uniformly best (among
estimators with the same location functions) and very nearly (except
when n is very small) median-unbiased. Tables of the Chi-square
distribution provide the constants χ²_{n,.5}, which can be used in
place of n in standard procedures for computing s or s², to obtain the
estimates σ(x,.5) or σ²(x,.5) respectively. Comparisons of these and
other estimators from the standpoint of median-bias, with tables, were
given by Eisenhart and Martin [16]. For the more usual problem in
which μ is unknown, with N = n+1 observations, the same remarks
apply to the usual mean-unbiased estimator
s² = Σ_{i=1}^{N} (y_i − ȳ)²/(N − 1) and to s. The theory of such
multi-parameter problems lies outside the formal scope of the present
paper.
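In place of Chi-square tables, the constants χ²_{n,α} of Example 2 can be computed numerically. The sketch below is one possible way to do this with the Python standard library only: it evaluates the Chi-square c.d.f. by the usual power series for the incomplete gamma function and inverts it by bisection. All function names are hypothetical, and the series-plus-bisection approach is an assumption of this illustration, not a method taken from the paper.

```python
import math


def chi2_cdf(x, n):
    """Chi-square c.d.f. with n degrees of freedom (series for the
    regularized lower incomplete gamma function P(n/2, x/2))."""
    if x <= 0.0:
        return 0.0
    a, z = n / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 0
    while term > 1e-17 * total:
        k += 1
        term *= z / (a + k)
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def chi2_lower_point(alpha, n):
    """Lower alpha-point of Chi-square(n), 0 < alpha < 1, by bisection."""
    lo, hi = 0.0, float(n) + 1.0
    while chi2_cdf(hi, n) < alpha:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, n) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sigma_limit(ys, alpha):
    """sigma(x, alpha) = s * sqrt(n / chi2_{n,alpha}), known mean 0."""
    n = len(ys)
    s2 = sum(y * y for y in ys) / n
    return math.sqrt(n * s2 / chi2_lower_point(alpha, n))


print(sigma_limit([1.0, -1.0, 2.0, -2.0], 0.5))  # median-unbiased estimate of sigma
```

For the assumed four observations, s = √2.5 ≈ 1.58 while the median-unbiased value is somewhat larger, which illustrates the remark that the two agree closely only when n is not small.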

Example 3. Binomial mean. Let x = (y_1, ..., y_n), where the Y_i's
are independent, Prob(Y_i = 1) = θ, Prob(Y_i = 0) = 1 − θ,
0 ≤ θ ≤ 1. Let Z be an auxiliary randomization variable, uniformly
distributed on 0 ≤ Z < 1. Then t = t(x,z) = nȳ + z, where
nȳ = Σ_{i=1}^{n} y_i, is a sufficient statistic having the monotone
likelihood ratio property; hence each nondecreasing function θ*(t)
taking values in the unit interval is a uniformly best estimator. The
classical (maximum likelihood, unbiased) estimator is θ̂ = [t]/n = ȳ,
where [t] is the largest integer not exceeding t. By use of binomial
tables, exact confidence limits θ(t,α) and median-unbiased estimators
θ(t,.5) can be determined easily as the solutions θ of the
equations α = Prob(T ≤ t | θ), where t is the observed value of
the statistic. For typical purposes of informative inference, it
seems preferable to dispense with use of the randomization
variable z; a non-randomized uniformly best point estimator having
location functions closest to ½, in a certain sense, is defined
for each observed value of nȳ as the solution θ of the equation
Prob(Ȳ < ȳ | θ) = Prob(Ȳ > ȳ | θ); this estimator θ̃(ȳ) is easily
determined by use of binomial tables; when n is not small, we have
θ̃(ȳ) ≈ ȳ. In all cases the effect of the randomization variable
is minor except when n is small. Thus the classical mean-unbiased
estimator can be justified on the grounds that it is uniformly
best (among estimators with the same location functions) and is
very nearly (except when nθ or n(1−θ) is very small) median-unbiased.
Other discrete examples with the m.l.r. property, such as the
Poisson and negative binomial, may be treated similarly.
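The two equations of Example 3 can also be solved numerically instead of by binomial tables. The following Python sketch uses bisection, relying on the fact that both Prob(T ≤ t | θ) and Prob(Y < k | θ) − Prob(Y > k | θ) are nonincreasing in θ. The function names, and the convention of passing the observed count k = nȳ and the randomization value z separately, are assumptions of this illustration.

```python
from math import comb


def binom_pmf(k, n, theta):
    """Prob(Y = k) for Y binomial(n, theta)."""
    return comb(n, k) * theta**k * (1.0 - theta) ** (n - k)


def prob_T_le(k, z, n, theta):
    """Prob(T <= k + z | theta) where T = Y + Z, Z uniform on [0, 1)."""
    below = sum(binom_pmf(j, n, theta) for j in range(k))
    return below + z * binom_pmf(k, n, theta)


def _solve_decreasing(g, target, iters=100):
    """Bisection on [0, 1] for g(theta) = target, g nonincreasing."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def randomized_limit(k, z, n, alpha):
    """Confidence limit theta(t, alpha): root of alpha = Prob(T <= t | theta)."""
    return _solve_decreasing(lambda th: prob_T_le(k, z, n, th), alpha)


def nonrandomized_estimate(k, n):
    """Root of Prob(Y < k | theta) = Prob(Y > k | theta)."""
    def g(th):
        below = sum(binom_pmf(j, n, th) for j in range(k))
        above = sum(binom_pmf(j, n, th) for j in range(k + 1, n + 1))
        return below - above
    return _solve_decreasing(g, 0.0)


print(nonrandomized_estimate(3, 10))   # compare with the classical ybar = 0.3
```

With k = 0 or k = n the non-randomized estimate falls on the boundary (0 or 1), and by symmetry the estimates for k and n − k sum to one.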

Example 4. Logistic mean. Let x = (y_1, ..., y_n) be a sample of
n independent observations from a logistic distribution with unknown
mean θ: Prob(Y ≤ y | θ) = Ψ(y − θ) = (1 + e^{−(y−θ)})^{−1},
−∞ < y < ∞, −∞ < θ < ∞; Y has the density function

    ψ(y,θ) = e^{−(y−θ)}/(1 + e^{−(y−θ)})²,  −∞ < y < ∞.

For any fixed Δ > 0, taking θ_1(θ) = θ − Δ, θ_2(θ) = θ + Δ
determines a score quasistatistic

    S(x,θ−Δ,θ+Δ) = Σ_{i=1}^{n} [log ψ(y_i,θ+Δ) − log ψ(y_i,θ−Δ)].

For any fixed α, 0 ≤ α ≤ 1, taking α(θ) ≡ α determines a score
quasistatistic

    v(x,θ,α) = S(x,θ−Δ,θ+Δ) − G(θ,α)

which satisfies the conditions of Corollary 1 of Section 6 above,
and hence determines an admissible confidence limit estimator
θ* = θ(x,α) as the solution θ of the equation v(x,θ,α) = 0. Since
θ is a translation parameter, G(θ,α) is independent of θ, and may be
written G(α). By symmetry, G(.5) = 0. G(α) can be determined
approximately, except for α very near 0 or 1 and for very small n,
by use of the Central Limit Theorem; let μ(u,θ) and σ²(u,θ) denote
respectively the mean and variance of S(Y,u−Δ,u+Δ) when θ is
true; then μ(θ,θ) = 0 by symmetry; we may write μ(u−θ) and
σ²(u−θ) because θ is a translation parameter. We have

    Prob{v(X,u,α) ≤ 0 | θ} ≈ Φ( [G(α) − nμ(u−θ)] / [√n σ(u−θ)] ),

which provides an approximation to the risk curves a(u,θ,θ*) of the
estimator θ* = θ(x,α); for the determination of G(α), similarly

    Prob{v(X,θ,α) ≤ 0 | θ} = α ≈ Φ(G(α)/(√n σ(0))),  or  G(α) ≈ √n σ(0) Φ^{-1}(α).

This, with the formula above, gives the approximate risk curves of θ*:

    a(u,θ,θ*) ≈ Φ( −√n μ(u−θ)/σ(u−θ) + [σ(0)/σ(u−θ)] Φ^{-1}(α) )  for u < θ,
              ≈ 1 − Φ(... same argument ...)                       for u > θ.

The preceding discussion depended throughout on the chosen
value Δ > 0. A locally best confidence limit estimator
θ* = θ(x,α) is determined as the solution θ of the equation

    v(x,θ,α) = S(x,θ) − G(α) = 0.

Here S(y,θ) = ∂ log ψ(y,θ)/∂θ = 2Ψ(y−θ) − 1; Ψ(Y−θ) has, when θ is
true, a uniform distribution on the unit interval; hence when θ
is true the c.d.f. of Σ_{i=1}^{n} Ψ(Y_i−θ) (and hence that
of S(X,θ)) can be calculated as in Cramér [13], pp. 244-246. The
normal approximation gives (since
σ²(0) = Var[S(Y,θ) | θ] = (1/n) Var[S(X,θ) | θ] = 1/3)
G(α) ≈ √(n/3) Φ^{-1}(α); α = ½ gives exactly G(½) = 0 and determines
the maximum likelihood estimator θ̂ = θ(x,.5). In general, a locally
best confidence limit estimator θ(x,α) is determined (approximately,
except for α = ½) as the root θ of the equation
S(x,θ) = √(n/3) Φ^{-1}(α), or

    Σ_{i=1}^{n} Ψ(y_i − θ) = n/2 + ½ √(n/3) Φ^{-1}(α).

Such an equation is easily solved numerically by use of Berkson's
tables of Ψ(u) ([17]).


The present example serves also to illustrate the
determination of an admissible confidence curve estimator by use of a
family of quasistatistics as described at the end of Section 6
above. Each of the families of quasistatistics v(x,θ,α), 0 ≤ α ≤ 1,
considered here (each based upon a fixed Δ > 0) has the property
that θ(x,α) is, for each fixed x, decreasing in α; in fact, for
each x, θ(x,α) decreases continuously from ∞ to −∞ as α
increases from 0 to 1. Thus for each observed x, each θ
(−∞ ≤ θ ≤ ∞) will be a confidence limit θ(x,α) for some α; we
can conveniently determine the required solutions θ(x,α) of
v(x,θ,α) = 0 in the form

    α(x,θ) = Prob{S(X,θ) ≤ S(x,θ) | θ} ≈ Φ( S(x,θ)/(√n σ(0)) )

for as many values of θ as desired.
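The computation of α(x,θ) for the locally best logistic score S(x,θ) = Σ[2Ψ(y_i − θ) − 1], for which √n σ(0) = √(n/3), can be sketched as follows. The function names are hypothetical, and the sample (0, 0, 6) with trial value θ = 2 is used only as an illustrative input.

```python
from math import exp, sqrt
from statistics import NormalDist


def logistic_cdf(u):
    """Psi(u) = 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + exp(-u))


def score(ys, theta):
    """Locally best score S(x, theta) = sum of 2*Psi(y_i - theta) - 1."""
    return sum(2.0 * logistic_cdf(y - theta) - 1.0 for y in ys)


def alpha_approx(ys, theta):
    """Normal approximation Phi(S(x, theta) / sqrt(n/3)) to alpha(x, theta)."""
    n = len(ys)
    return NormalDist().cdf(score(ys, theta) / sqrt(n / 3.0))


def confidence_curve(ys, theta):
    """c(theta, x) = min[alpha(x, theta), 1 - alpha(x, theta)]."""
    a = alpha_approx(ys, theta)
    return min(a, 1.0 - a)


x = [0.0, 0.0, 6.0]
print(round(score(x, 2.0), 3), round(alpha_approx(x, 2.0), 3))
```

Evaluating `confidence_curve` on a grid of θ values gives the points needed to sketch a confidence curve estimate.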

Numerical example. Let x = (y_1,y_2,y_3) = (0,0,6). Letting θ_i
denote a trial value of θ, S_i = S(x,θ_i), and
α_i = α(x,θ_i) = Prob{S(X,θ_i) ≤ S(x,θ_i) | θ_i}, i = 1,2,..., and
taking θ_1 = ȳ = 2 as a trial value plausibly near θ(x,.5) = θ̂, we
obtain

    S_1 = 2 Σ_{i=1}^{3} Ψ(y_i − 2) − 3 = −0.559,  α_1 ≈ Φ(−.559) = .288.

Further similar computations are summarized in Table 1 and in Fig. 1,
a sketch of the confidence curve c(θ,x) = min[α(x,θ), 1 − α(x,θ)].

Table 1

     i     θ_i       S_i       α_i (approx.)   α_i (exact)

     1     2.0     -0.559         .288
     2     1.44    -0.256         .399
     3     1.18    -0.0758        .470
     4     1.12    -0.031         .488
     5     1.08    -0.0005        .4998           .4998
     6     3.08    -0.927         .177
     7     4.0     -1.166         .122
     8     5.0     -1.511         .065
     9     6.0     -2.0           .023
    10     7.0     -2.462         .007
    11    -1.0      1.924         .973
    12    -2.0      2.523         .994            .998
    13     0.0      1.0           .841            .833

Figure 1. Sketch of the confidence curve c(θ,x).

The closeness of the normal approximations can be checked in
the present case by use of the exact formula (based on Cramér, l.c.)

    α(x,θ) = z³/6,                0 ≤ z ≤ 1,
           = z³/6 − ½(z−1)³,      1 ≤ z ≤ 2,
           = 1 − (3−z)³/6,        2 ≤ z ≤ 3,

where z = z(x,θ) = ½(S(x,θ) + 3). The approximation is seen to be
quite adequate here. In other examples, if exact values of
α(x,θ) cannot be obtained by use of standard tables or tractable
integrals, one may consider checking approximate values of
α(x,θ), for a few values of θ of particular interest, by use of
(a) the error-bound on the normal approximation, (b) numerical
integration, (c) empirical sampling (Monte Carlo), or possibly
(d) an asymptotic expansion. For (a) and (d), see Wallace [18].
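The exact check can be sketched in Python using the standard closed form for the c.d.f. of a sum of n independent uniform variables (often called the Irwin-Hall distribution), which for n = 3 reduces to the piecewise cubic distribution of z = ½(S + 3). The function names are hypothetical, and generalizing from n = 3 to arbitrary n is an assumption of this illustration.

```python
from math import comb, factorial, floor


def uniform_sum_cdf(z, n):
    """C.d.f. of the sum of n independent uniform(0, 1) variables."""
    if z <= 0.0:
        return 0.0
    if z >= n:
        return 1.0
    total = sum((-1) ** k * comb(n, k) * (z - k) ** n
                for k in range(floor(z) + 1))
    return total / factorial(n)


def alpha_exact(score_value, n):
    """Exact Prob(S(X, theta) <= s | theta) for the locally best
    logistic score, using S = 2 * (sum of n uniforms) - n."""
    return uniform_sum_cdf(0.5 * (score_value + n), n)


# Exact counterparts of normal-approximation values for n = 3.
for s in (-0.0005, 1.0, 2.523):
    print(s, round(alpha_exact(s, 3), 4))
```

For n = 3 and S = 1.0 this gives 5/6, to be compared with the normal approximation Φ(1.0) = .841.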
The values θ_i above, for i = 2, ..., 5, were determined by
θ_{i+1} = θ_i + S_i, based on Fisher's formula
θ_{i+1} = θ_i + S(x,θ_i)/Var[S(X,θ_i) | θ_i] for iterative calculation
of maximum likelihood estimates. If log f(x,θ) ≈ aθ² + bθ + c for
some constants a < 0, b, c, at least for θ near θ̂(x) (asymptotic
theory shows that this will be the case with high probability for
sufficiently large n, under certain regularity conditions), then
S(x,θ) ≈ 2aθ + b, ∂S(x,θ)/∂θ ≈ 2a; (aθ² + bθ + c) is maximized by
θ* = −b/2a = θ − S(x,θ)/[∂S(x,θ)/∂θ]. ∂S(x,θ)/∂θ may be calculated
directly; or approximated numerically from difference quotients
[S(x,θ_i) − S(x,θ_{i−1})]/(θ_i − θ_{i−1}) based on previously
calculated θ_i's; or (as done above) "estimated" by its expected
value: for sufficiently large n, with high probability the
approximation

    ∂S(x,θ)/∂θ ≈ E[∂S(X,θ)/∂θ | θ] = E[∂² log f(X,θ)/∂θ² | θ]
               = −Var[S(X,θ) | θ] = −I(θ)

is effectively close. The rate of convergence of θ_i to θ̂ may be
slow as above, for samples with "improbable configurations" and/or
small n; use of ∂S(x,θ)/∂θ rather than its expected value here would
evidently give faster convergence, but would require additional
calculations for each i. Speed of convergence is not of exclusive
interest here; since a number of values of α_i = α(x,θ_i) are
desired for a sketch of the confidence curve estimate, any convenient
method of choosing successive θ_i's may be used.
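Fisher's iteration θ_{i+1} = θ_i + S(x,θ_i)/Var[S(X,θ_i)|θ_i] can be sketched for the logistic example, where the variance of the score is n/3 for every θ. The function name `fisher_scoring`, the stopping tolerance, and the iteration cap are hypothetical choices made for this illustration.

```python
from math import exp


def logistic_cdf(u):
    return 1.0 / (1.0 + exp(-u))


def score(ys, theta):
    """S(x, theta) = sum of 2*Psi(y_i - theta) - 1 (logistic location model)."""
    return sum(2.0 * logistic_cdf(y - theta) - 1.0 for y in ys)


def fisher_scoring(ys, theta0, tol=1e-8, max_iter=200):
    """Iterate theta <- theta + S(x, theta) / I with I = n/3.

    Returns the final value and the list of all trial values visited.
    """
    info = len(ys) / 3.0
    theta = theta0
    path = [theta]
    for _ in range(max_iter):
        step = score(ys, theta) / info
        theta += step
        path.append(theta)
        if abs(step) < tol:
            break
    return theta, path


theta_hat, trial_values = fisher_scoring([0.0, 0.0, 6.0], theta0=2.0)
print(round(theta_hat, 3), [round(t, 2) for t in trial_values[:5]])
```

Every trial value visited along the way can also be reused as a point on the confidence curve, as remarked above, so the intermediate iterates are returned rather than discarded.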

The values θ_6 and θ_11 were chosen as trial approximations
to the confidence limits θ(x,.025), θ(x,.975) respectively,
by use of the asymptotic formula for such confidence limits:

    θ̂ ± Φ^{-1}(.975)/√(Var[S(X,θ) | θ̂]) ≈ θ̂ ± 2.

The poor approximations obtained provide a limited illustration
of the fact that such approximations are "more asymptotic," i.e.
may be expected to be often less close, than the normal
approximations to distributions of score statistics.

Example 5. Laplacean mean. Let x = (y_1, ..., y_n) be a sample
of n independent observations from a Laplacean (double exponential)
distribution with unknown mean θ, −∞ < θ < ∞, with density
function

    h(y,θ) = ½ e^{−|y−θ|},  −∞ < y < ∞.

For any fixed Δ > 0, let

    v(x,θ,α) = S(x,θ−Δ,θ+Δ) − G(θ,α)
             = Σ_{i=1}^{n} (|y_i − θ + Δ| − |y_i − θ − Δ|) − G(α).

We note that

    |y − θ + Δ| − |y − θ − Δ| =  2Δ        if θ ≤ y − Δ,
                              =  2(y − θ)  if y − Δ ≤ θ ≤ y + Δ,
                              = −2Δ        if y + Δ ≤ θ,

and hence

    −2Δn ≤ Σ_{i=1}^{n} (|y_i − θ + Δ| − |y_i − θ − Δ|) ≤ 2Δn  for all x.

Since Prob{Y ≤ θ − Δ | θ} = ½e^{−Δ}, the c.d.f. of
Σ_{i=1}^{n} (|Y_i − θ + Δ| − |Y_i − θ − Δ|) has a jump of (½e^{−Δ})^n
at each end of its range, and is continuously increasing between these
jumps. Hence G(α) is well-defined if (½e^{−Δ})^n < α < 1 − (½e^{−Δ})^n;
for other α's use of an auxiliary randomization variable would be
necessary; by symmetry, G(½) = 0. A simple computation gives

    Var(|Y − θ + Δ| − |Y − θ − Δ|) = 8(1 − e^{−Δ} − Δe^{−Δ}) = v,

say; for n not very small and α not extreme, the normal
approximation to the distribution of v(X,θ,α) gives

    G(α) ≈ √(nv) Φ^{-1}(α).

For any α bounded as above, by Corollary 1 the estimator θ(x,α),
defined as the solution θ of v(x,θ,α) = 0, is admissible.

The median-unbiased estimator θ̃(x) = θ(x,.5), defined as the
solution θ of

    Σ_{i=1}^{n} |(y_i − Δ) − θ| = Σ_{i=1}^{n} |(y_i + Δ) − θ|

(which is easily solved numerically), depends upon the particular
value Δ chosen; the error-probabilities a(θ − Δ,θ,θ̃),
a(θ + Δ,θ,θ̃) have a minimized common value for all θ.

Locally-best estimators ("Δ → 0") θ(x,α) are defined by
use of

    v(x,θ,α) = Σ_{i=1}^{n} I(y_i > θ) − Σ_{i=1}^{n} I(y_i < θ) − G(α),

where, for any relation R, the indicator-function I(R) is defined
by I(R) = 1 if R is true and I(R) = 0 if R is false. Thus
Σ I(y_i > θ) − Σ I(y_i < θ) is the number of observations y_i
exceeding θ minus the number of observations less than θ; with
probability one, the observations y_i have n distinct values, and
may be ordered, y_(1) < y_(2) < ... < y_(n). Then

    Σ_{i=1}^{n} I(y_i > θ) − Σ_{i=1}^{n} I(y_i < θ)
        = Σ_{i=1}^{n} I(y_(i) > θ) − Σ_{i=1}^{n} I(y_(i) < θ).

Let r be any integer, 1 ≤ r ≤ n. It is easily seen that for

    α = 1 − (1/2^n) Σ_{j=0}^{r−1} C(n,j),   G(α) = n + 1 − 2r;

hence

    v(x,θ,α) = Σ I(y_i > θ) − Σ I(y_i < θ) − (n + 1 − 2r).

With probability one, v(x,θ,α) = 0 will have a unique solution,
namely θ(x,α) = y_(r). Since G(0) = −n and G(1) = n, θ(x,1) = −∞
and θ(x,0) = +∞. For any observed x, the set of (n+2) confidence
limits

    [θ(x,1), θ(x,1−(½)^n), ..., θ(x,(½)^n), θ(x,0)] = [−∞, y_(1), y_(2), ..., y_(n), ∞]

serves as a (locally-best) confidence curve estimate. (For other
values of α, use of an auxiliary randomization variable would be
required in defining v(x,θ,α).) In contrast to the approximate
confidence limits given by asymptotic methods, the various exact
confidence limits here depend on all values y_i in the sample x and
not only on the value of θ̂ = y_((n+1)/2), the sample median (for n
odd).
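The locally best confidence limits of Example 5 are simply the order statistics, paired with levels α_r = 1 − 2^{−n} Σ_{j<r} C(n,j). A minimal Python sketch follows; the function name and the three-point sample are hypothetical.

```python
from math import comb, inf


def laplace_confidence_limits(ys):
    """Return (alpha, theta(x, alpha)) pairs for the locally best limits.

    The r-th order statistic y_(r) is the confidence limit at level
    alpha_r = 1 - 2**(-n) * sum_{j=0}^{r-1} C(n, j); the endpoints
    alpha = 1 and alpha = 0 correspond to -infinity and +infinity.
    """
    n = len(ys)
    ordered = sorted(ys)
    limits = [(1.0, -inf)]
    for r in range(1, n + 1):
        alpha_r = 1.0 - sum(comb(n, j) for j in range(r)) / 2**n
        limits.append((alpha_r, ordered[r - 1]))
    limits.append((0.0, inf))
    return limits


for alpha, limit in laplace_confidence_limits([3.0, 1.0, 2.0]):
    print(alpha, limit)
```

For odd n the middle pair has α = ½ and is the sample median, which is the median-unbiased estimate.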
For the more general problem of estimating the median θ of a
Laplacean density function

    h(y,θ,c) = (1/(2c)) e^{−|y−θ|/c},  −∞ < y < ∞,

with known scale parameter c > 0, similar derivations give the
same locally best confidence limits and confidence curve
estimators. Since these estimators are independent of c, they can be
used for estimation of θ in the more general problem in which c
is unknown. For the latter problem, they remain valid and locally
best (with respect to errors in estimation of θ, uniformly in c),
and their risk curves respectively depend on the argument (u−θ)/c.
Still more generally, let the Y_i's be independent with any
continuous c.d.f. of unknown form, with unknown median θ. Since
the estimators of θ given above remain valid (have the given
location functions), and are essentially unique locally-best
estimators with the given location functions in the special case
of Laplacean distributions, these estimators may be called
admissible for the non-parametric problem of estimation of a median
of a (continuous) distribution of unknown form. Similar remarks
apply to such use of order statistics y_(r) as estimators of the
p-quantile of a continuous distribution of unknown form; here the
generalized Laplacean density function


h(y,©) 


y  <  «  » 
y  i  8  > 


for which θ is the p-quantile, replaces the Laplacean density,
for any specified p, 0 < p < 1, and the derivation proceeds in
essentially the same way as above where p = ½.

Example 6. Quantal response models. Let x = (y_1, ..., y_n),
where the Y_i's are independent,

    Prob{Y_i = 1 | θ} = P_i(θ),  Prob{Y_i = 0 | θ} = Q_i(θ) = 1 − P_i(θ),
    i = 1, ..., n,

where the P_i(θ)'s are known increasing functions of θ, having
derivatives P_i'(θ), θ ∈ Ω, an open interval. Examples include:

(1) Dilution series [19]: P_i(θ) = 1 − e^{−d_i θ}, where d_i is a
known "dose" (volume) of material examined in the i-th observation,
and θ is the unknown mean concentration of minute particles per unit
volume randomly distributed in the material.

(2) Mental ability tests, normal model [20]:
P_i(θ) = (1/k_i) + ((k_i − 1)/k_i) Φ(a_i + b_i θ) is
the probability that a subject with unknown ability-parameter θ will
respond correctly to the i-th item in a test. Here Φ is the standard
normal c.d.f., and the parameters 0 < k_i ≤ ∞, −∞ < a_i < ∞, and
b_i > 0 which characterize the i-th item may be assumed known (or
estimated with high precision) on the basis of previous investigation;
a_i represents the item's level of difficulty, b_i its sensitivity, and
(1/k_i) if positive may be interpreted as the probability of a correct
response due to guessing only.

(3) Mental ability test, logistic model [21]: As in (2), with Φ(u)
replaced by the logistic c.d.f. Ψ((1.7)u) = 1/(1 + e^{−(1.7)u}). This
very slight quantitative modification gives a model which is equally
plausible and has much greater mathematical tractability; in the case
where 1/k_i = 0, it provides a sufficient statistic with the monotone
likelihood ratio property.

(4) One-parameter bioassay model, normal form [22]:
P_i(θ) = (1/k) + ((k−1)/k) Φ(θ + b d_i). Here θ is the unknown
concentration of a component in material being assayed; the case
1/k = 0 is most common; d_i is a known dose parameter; b is a
sensitivity parameter which in special cases may be known or
estimated with relatively high precision.

(5) One-parameter bioassay model, logistic form [23]: As in (4), with
Φ replaced by Ψ. In the usual case 1/k = 0, with b known this model
provides a sufficient statistic with the monotone likelihood ratio
property.

We have

    S(y_i,θ) = P_i'(θ)/P_i(θ)                            for y_i = 1,
             = Q_i'(θ)/Q_i(θ) = −P_i'(θ)/(1 − P_i(θ))    for y_i = 0,

or

    S(y_i,θ) = −P_i'(θ)/Q_i(θ) + y_i P_i'(θ)/(P_i(θ)Q_i(θ)),  y_i = 0 or 1;

and

    μ_i(u,θ) = E[S(Y_i,u) | θ] = P_i'(u) [P_i(θ)/P_i(u) − Q_i(θ)/Q_i(u)],

    σ_i²(u,θ) = Var[S(Y_i,u) | θ] = P_i'(u)² [P_i(θ)Q_i(θ)/(P_i(u)²Q_i(u)²)],

    μ_i(θ,θ) = 0,  σ_i²(θ,θ) = P_i'(θ)²/(P_i(θ)Q_i(θ)).


The normal approximation gives

    Prob[S(X,u) ≤ k | θ] ≈ Φ( [k − Σ_i μ_i(u,θ)] / [Σ_i σ_i²(u,θ)]^{1/2} ).

For a given (u,θ), this approximation is close provided that (a) the
right member is not very near 0 nor 1, and (b) the number m of
σ_i²(u,θ)'s near max_i σ_i²(u,θ) in value is not small.

If for each i and y_i, S(y_i,θ) is decreasing in θ (i.e.,
P_i(θ)P_i''(θ) < P_i'(θ)² and −Q_i(θ)P_i''(θ) < P_i'(θ)²), then
v(x,θ) = S(x,θ) satisfies the conditions of Corollary 1, and the
maximum likelihood estimator θ̂(x), the solution of S(x,θ) = 0, is
admissible; if the normal approximation above (with u = θ) is close
for respective values of θ, θ̂ is approximately median-unbiased; if
the approximation is close for respective values of (u,θ), θ̂ has the
approximate risk curves

    a(u,θ,θ̂) ≈ Φ( −Σ_i μ_i(u,θ) / [Σ_i σ_i²(u,θ)]^{1/2} ),  u < θ,
             ≈ 1 − Φ(... same argument ...),                u > θ.

More generally, to determine locally best (approximate)
confidence limits θ(x,α) as solutions θ of

    v(x,θ,α) = S(x,θ) − [Σ_i σ_i²(θ,θ)]^{1/2} Φ^{-1}(α) = 0,

a simple adaptation of the discussion at the end of Section 8.1
above may be applied to the problem of verification of the
conditions of Corollary 1.
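For the dilution-series model P_i(θ) = 1 − e^{−d_i θ}, the general score formula reduces to S(y_i,θ) = d_i [y_i/(1 − e^{−d_i θ}) − 1], which is decreasing in θ, so the maximum likelihood estimate can be found by bisection. The Python sketch below illustrates this; the function names, the search bracket, and the data are hypothetical, and the sketch assumes at least one positive and one negative response so that a finite positive root exists.

```python
from math import exp, log


def dilution_score(ys, doses, theta):
    """S(x, theta) = sum of d_i * (y_i / (1 - exp(-d_i * theta)) - 1)."""
    return sum(d * (y / (1.0 - exp(-d * theta)) - 1.0)
               for y, d in zip(ys, doses))


def dilution_mle(ys, doses, lo=1e-9, hi=1e3, iters=200):
    """Root of the decreasing score S(x, theta) = 0 by bisection.

    Assumes the root lies in (lo, hi), i.e. the responses are neither
    all 0 nor all 1.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if dilution_score(ys, doses, mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Two observations at unit dose, one positive: the root is log 2.
print(dilution_mle([1, 0], [1.0, 1.0]), log(2.0))
```

With equal doses d and k positive responses out of n, the root has the closed form −log(1 − k/n)/d, which gives a simple check on the numerical solution.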
Example 7. Rectangular mean. Let x = (y_1, ..., y_n) be a sample
of n independent observations on a random variable Y with density

    h(y,θ) = 1  if θ − ½ ≤ y ≤ θ + ½,
           = 0  otherwise,

with θ = E(Y) unknown. Let r and s denote respectively the smallest
and the largest of the observed values y_i. Let θ* = θ*(r,s) be any
function, defined for all r,s such that r ≤ s ≤ r + 1, which
satisfies s − ½ ≤ θ*(r,s) ≤ r + ½ and which is nondecreasing in r
and in s. Then θ*(r,s) satisfies the conditions of Lemma 1 since,
for each θ_0, {x | θ* ≤ θ_0} and {x | θ* < θ_0} satisfy the (necessary
and) sufficient condition given by Pratt [24] for admissibility of
one-sided tests on θ. Venketeraman [25] has shown that such
estimators constitute an essentially complete class, and has given
minimal complete and minimal essentially complete classes of
estimators of θ.

For samples of size $n = 2$, each of the following estimators is admissible
and median-unbiased:

$$
\theta^*(x) = (r + s)/2, \quad \text{the usual mean-unbiased estimator;}
$$

$$
\theta'(x) =
\begin{cases}
s - \tfrac12, & \text{if } s > r + 1/\sqrt2, \\
r + (\sqrt2 - 1)/2, & \text{if } s \le r + 1/\sqrt2;
\end{cases}
$$

$$
\theta''(x) =
\begin{cases}
r + \tfrac12, & \text{if } r < s - 1/\sqrt2, \\
s - (\sqrt2 - 1)/2, & \text{if } r \ge s - 1/\sqrt2.
\end{cases}
$$

Among median-unbiased admissible estimators, $\theta'$ is uniformly best with
respect to errors of under-estimation, and $\theta''$ is uniformly best with
respect to errors of over-estimation. Analogous confidence curve estimators
are easily constructed.
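The median-unbiasedness of these three estimators, and their ordering with respect to errors of under-estimation, can be checked by simulation. The sketch below is only a Monte Carlo illustration: the closed forms used for $\theta'$ and $\theta''$ (a maximum and a minimum of two linear functions of $r$ and $s$) are an assumption about the piecewise definitions, and the function names, seed, and number of trials are arbitrary choices made for the example.

```python
import math
import random

C = (math.sqrt(2.0) - 1.0) / 2.0  # the constant (sqrt(2) - 1)/2


def theta_star(r, s):
    """Midrange, the usual mean-unbiased estimator."""
    return (r + s) / 2.0


def theta_prime(r, s):
    """Assumed form: s - 1/2 when s > r + 1/sqrt(2), otherwise r + C."""
    return max(s - 0.5, r + C)


def theta_double_prime(r, s):
    """Assumed form: r + 1/2 when r < s - 1/sqrt(2), otherwise s - C."""
    return min(r + 0.5, s - C)


def prob_at_most(estimator, theta, u, trials=100_000, seed=0):
    """Monte Carlo estimate of Prob[estimator(X) <= u | theta] for n = 2."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        y1 = theta - 0.5 + rng.random()
        y2 = theta - 0.5 + rng.random()
        r, s = min(y1, y2), max(y1, y2)
        if estimator(r, s) <= u:
            hits += 1
    return hits / trials
```

Calling `prob_at_most(est, theta, theta)` for each estimator should give values near 1/2, and for $u < \theta$ the value for `theta_prime` should be the smallest of the three.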

For  any  fixed  k,  0  |  k  |  for  testing  hypotheses  of  the  form 
H(©  >:  0  %  ®0  or  ®  <  9q,  there  is  an  admissible  acceptance 

region 


A(e0)  =  g  #0  +  k,  aS«0+ij 

and  another  admissible  acceptance  region 

A,(0o)  S  ®0  “  *  3  ®0  “  1}  * 


From  such  tests  we  obtain  admissible  confidence  limit  estimators  at 
each  level,  and  the  corresponding  admissible  confidence  curve 
estimator: 


c(©,x) 


0,  ifO>r+|orQss-|, 

i  -  I®  - 


nt.hAnwl  so 


If $x = (0.9, 1.1) = (r,s)$, or alternatively if $x = (0.6, 1.4) = (r,s)$, we
obtain respective confidence curve estimates which reflect that the "amount
of information in a sample" increases with $(s - r)$:

[Figure: confidence curves $c(\theta,x)$ for the two samples, plotted against
$\theta$; axis marked at .5, 1.0, 1.5.]

Alternatively, for any fixed $k$, $-\tfrac12 \le k \le \tfrac12$, there is for
each $H(\theta_0)$ and $H(\theta_0-)$ an admissible acceptance region

$$
A(\theta_0) = \{x \mid (\tfrac12 - k)\,r + (\tfrac12 + k)\,s \le \theta_0 + k\}.
$$


From such tests we obtain admissible confidence limit estimators at each
confidence level, and the corresponding admissible confidence curve
estimator:


c(0,x) 


iffl|r+|or8§s- 
-  ^r!f s'fr >  otherwise 


1 


2 


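For the two samples considered above, we obtain the respective confidence
curve estimates:

[Figure: confidence curves $c(\theta,x)$ given by the second estimator for the
two samples, plotted against $\theta$; axis marked at .5, 1.0, 1.5.]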


Since the last curve lies under that given by the first estimator for the
same sample, it provides stronger inferences about $\theta$. This is not
inconsistent with the admissibility of the first estimator, which provides
(at most confidence levels) stronger inferences (shorter confidence
intervals) from relatively uninformative samples like the first sample.

Example 8. Cauchy median. Let $Y$ have the Cauchy density function

$$
h(y,\theta) = \frac{1}{\pi(1 + (y-\theta)^2)}, \qquad -\infty < y < \infty, \quad -\infty < \theta < \infty.
$$

Then

$$
S(y,\theta) = \frac{2(y-\theta)}{1 + (y-\theta)^2}.
$$

Taking $v(x,\theta) = S(y,\theta)$, the conditions of Corollary 1 are
satisfied, and $v(x,\theta) = 0$ defines the median-unbiased locally-best
estimator $\theta^*(y) = y$. However for $\alpha \ne \tfrac12$,
$0 < \alpha < 1$, the conditions of Corollary 1 are not satisfied by
$v(x,\theta) = S(y,\theta) - G(\alpha)$. For $x = (y_1, y_2)$, even for
$\alpha = \tfrac12$, $v(x,\theta) = S(x,\theta) = \sum_{i=1}^{2} S(y_i,\theta)$
fails to satisfy the conditions of Corollary 1. (For $|y_2 - y_1|$ large,
$S(x,\theta) = 0$ has three roots $\theta$.) Thus in general there do not
exist confidence limit estimators (nor median-unbiased estimators) which are
locally-best uniformly in $\Omega$.
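The three-root behaviour for two Cauchy observations can be made explicit. Writing $m = (y_1+y_2)/2$ and $d = |y_2 - y_1|/2$, the equation $S(x,\theta) = 0$ reduces to $(\theta - m)\,(d^2 - 1 - (\theta - m)^2) = 0$, so there are three roots exactly when $|y_2 - y_1| > 2$. The following sketch (function names are illustrative only) computes the roots from this factorization and can be used to check them against the score itself.

```python
import math


def cauchy_score(y, theta):
    """Score S(y, theta) = 2 (y - theta) / (1 + (y - theta)^2)."""
    d = y - theta
    return 2.0 * d / (1.0 + d * d)


def sample_score(ys, theta):
    """Score of an independent sample: the sum of the individual scores."""
    return sum(cauchy_score(y, theta) for y in ys)


def score_roots_two_obs(y1, y2):
    """Roots in theta of S((y1, y2), theta) = 0, in increasing order."""
    m = (y1 + y2) / 2.0
    d = abs(y2 - y1) / 2.0
    if d <= 1.0:
        return [m]  # single root at the midpoint
    w = math.sqrt(d * d - 1.0)
    return [m - w, m, m + w]
```

For example, `score_roots_two_obs(-3.0, 3.0)` returns three roots, and the score changes sign from negative to positive at the middle root, so it is not decreasing in $\theta$.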

10. Introduction to general theory of admissible estimators.

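To illustrate the general theory of admissible estimators, and the place of
the methods introduced above within the general theory, we consider the case
in which $\Omega$ is finite: $\Omega = \{\theta \mid \theta = 1, 2, \ldots, k\}$.
The principal features of the general case (in which $\Omega$ is any subset of
the real line) can be illustrated conveniently in this case, for which the
complete theory can be developed by relatively elementary methods. For any
such estimation problem, we have a specified family of density functions
$f(x,\theta)$, $\theta = 1, \ldots, k$. For each estimator $\theta^*(x)$, let

$$
b(u,\theta,\theta^*) =
\begin{cases}
\mathrm{Prob}[\theta^*(X) = u \mid \theta], & \text{if } u \ne \theta, \\
0, & \text{if } u = \theta,
\end{cases}
$$

for $u, \theta = 1, \ldots, k$. The risk curves of $\theta^*$ are

$$
a(u,\theta,\theta^*) =
\begin{cases}
\sum_{j \le u} b(j,\theta,\theta^*), & \text{if } u < \theta, \\
0, & \text{if } u = \theta, \\
\sum_{j \ge u} b(j,\theta,\theta^*), & \text{if } u > \theta.
\end{cases}
$$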


It is useful to interpret such an estimation problem in relation to a
somewhat different statistical inference or decision problem, which for
brevity we shall call the multidecision problem: This other problem is that
of choosing, on the basis of an observed value $x$, one of $k$ specified
simple hypotheses; it may also be described as an estimation problem which
lacks a parametric structure in the sense that no ordering of the labels
$\theta$ of the $k$ hypotheses is relevant to the problem. Any measurable
function $\theta^*(x)$ taking only the values $1, \ldots, k$ represents both a
possible solution to the multidecision problem (a decision function, or an
inference function, or an "estimator" in the last-mentioned sense) and an
estimator in the sense discussed above.

For the multidecision problem, the merits of each decision function
$\theta^*(x)$ are represented completely by its error-probabilities
$b(j,\theta,\theta^*)$; for each $\theta$, such probabilities are the
components of the vector-valued risk function of $\theta^*$ at $\theta$. The
general goal is to determine decision functions $\theta^*$ for which these
error-probabilities are minimized jointly in some suitable sense. A decision
function $\theta^*$ is called admissible if there is no other for which all
corresponding error-probabilities are at least as small, with at least one
strictly smaller. Complete classes, minimal essentially complete classes,
etc., are defined correspondingly (cf. Lindley [26] and Wolfowitz [27]).

A simple necessary condition that $\theta^*(x)$ be admissible for the
estimation problem is that it be admissible for the multidecision problem.
For if $\theta^{**}$ is better than $\theta^*$ for the latter problem,
$b(j,\theta,\theta^{**}) \le b(j,\theta,\theta^*)$ for all $(j,\theta)$, with
at least one inequality strict; therefore
$a(u,\theta,\theta^{**}) \le a(u,\theta,\theta^*)$ for all $(u,\theta)$, with
at least one inequality strict. Thus the admissible estimators are a subclass
(typically a relatively small one) of the admissible multidecision functions.
Similarly every essentially complete class of multidecision functions
contains an essentially complete class of estimators.

The relations between the estimation and multidecision problems can be
illustrated further in terms of techniques, related to Bayes' formula, which
play basic roles in the theory of each problem: For any estimation problem
specified as above, let $q = q(u,\theta)$ be an arbitrary real-valued function
such that $q(u,\theta) \ge 0$ for $u, \theta = 1, \ldots, k$; any such function
will be called a weight function (for the estimation problem). For any such
$q$ and any estimator $\theta^*$, we define the (generalized) Bayes risk:

$$
R(q,\theta^*) = \sum_{\theta=1}^{k} \sum_{u=1}^{k} q(u,\theta)\, a(u,\theta,\theta^*).
$$

On the other hand, for any multidecision problem specified as above, let
$Q = Q(u,\theta) \ge 0$ be an arbitrary weight-function; then for any
multidecision function $\theta^*$ the corresponding Bayes risk is:

$$
R'(Q,\theta^*) = \sum_{\theta=1}^{k} \sum_{j=1}^{k} Q(j,\theta)\, b(j,\theta,\theta^*).
$$

For any given $\theta^*$ and $q(u,\theta)$, we have

$$
R(q,\theta^*) = R'(Q,\theta^*),
$$

where

$$
Q(j,\theta) =
\begin{cases}
\sum_{j \ge u > \theta} q(u,\theta), & \text{for } j > \theta, \\
0, & \text{for } j = \theta, \\
\sum_{j \le u < \theta} q(u,\theta), & \text{for } j < \theta.
\end{cases}
$$


For each $\theta$, $Q(j,\theta)$ is nondecreasing in $j$ for $j > \theta$, and
nonincreasing in $j$ for $j \le \theta$; that is, $Q(j,\theta)$ has a single
relative minimum which it assumes on one or more consecutive values of $j$
including $j = \theta$. Thus each weight-function $q(u,\theta)$ for the
estimation problem determines uniquely a weight-function $Q(j,\theta)$ for the
multidecision problem, which has, for each $\theta$, a single relative
minimum. Conversely a weight-function $Q(j,\theta)$ for the multidecision
problem having, for each $\theta$, a single relative minimum (in the
preceding sense) determines uniquely (through the last equation) a
weight-function $q(u,\theta)$ for the estimation problem. Thus the Bayes
solutions $\theta^*$ for the estimation problem (i.e. the functions $\theta^*$
which, for some given $q$, minimize $R(q,\theta^*)$) are a subclass of the
Bayes solutions for the multidecision problem, characterized by the preceding
restriction on the possible forms of the weight function $Q(u,\theta)$ for the
latter problem.
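The correspondence between the two kinds of weight function can be checked numerically on a small finite problem. The sketch below builds $Q$ from $q$ by the partial sums just described, builds risk curves $a$ from error-probabilities $b$, and compares the two Bayes risks. The helper names and the randomly generated example are assumptions made for illustration; labels run from 1 to $k$ and index 0 of each list is unused.

```python
import random


def induced_weights(q, k):
    """Q[j][theta]: sum of q[u][theta] over u between theta (exclusive) and j (inclusive)."""
    Q = [[0.0] * (k + 1) for _ in range(k + 1)]
    for theta in range(1, k + 1):
        for j in range(1, k + 1):
            if j > theta:
                Q[j][theta] = sum(q[u][theta] for u in range(theta + 1, j + 1))
            elif j < theta:
                Q[j][theta] = sum(q[u][theta] for u in range(j, theta))
    return Q


def risk_curves(b, k):
    """a[u][theta]: tail sums of b[.][theta] on the side of u away from theta."""
    a = [[0.0] * (k + 1) for _ in range(k + 1)]
    for theta in range(1, k + 1):
        for u in range(1, k + 1):
            if u < theta:
                a[u][theta] = sum(b[j][theta] for j in range(1, u + 1))
            elif u > theta:
                a[u][theta] = sum(b[j][theta] for j in range(u, k + 1))
    return a


def bayes_risk(weights, probs, k):
    """Double sum of weights[i][theta] * probs[i][theta] over labels 1..k."""
    return sum(weights[i][t] * probs[i][t]
               for i in range(1, k + 1) for t in range(1, k + 1))


def random_example(k, seed=0):
    """Hypothetical weights q and error-probabilities b for testing."""
    rng = random.Random(seed)
    q = [[0.0] * (k + 1) for _ in range(k + 1)]
    b = [[0.0] * (k + 1) for _ in range(k + 1)]
    for theta in range(1, k + 1):
        raw = [rng.random() for _ in range(k)]
        total = sum(raw)
        for j in range(1, k + 1):
            if j != theta:
                q[j][theta] = rng.random()
                b[j][theta] = raw[j - 1] / total
    return q, b
```

For any such example, `bayes_risk(q, risk_curves(b, k), k)` and `bayes_risk(induced_weights(q, k), b, k)` should agree up to rounding, and each column of the induced $Q$ has its minimum at $j = \theta$.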

For any given weight-function $q$, the determination of Bayes estimators is
conveniently carried out as follows: Let $Q$ be determined by $q$ as above.
Then $R(q,\theta^*) = R'(Q,\theta^*)$ is minimized if, for each $x$,
$\theta^*(x)$ takes the (a) value $u$ which minimizes

$$
\sum_{\theta=1}^{k} Q(u,\theta)\, f(x,\theta).
$$

A simple sufficient condition for admissibility of an estimator is that it be
an essentially unique Bayes solution in the sense that for some $q$ it
minimizes $R(q,\theta^*)$, and every other estimator which also minimizes
$R(q,\theta^*)$ has the same risk-curves $a(u,\theta)$. (A related sufficient
condition for admissibility is that an estimator be a Bayes solution with
respect to each of the weight functions $q_1, \ldots, q_{r-1}$, and that among
all such estimators it is an essentially unique Bayes solution with respect
to some $q_r$.) Another simple sufficient condition for admissibility is that
an estimator be a Bayes solution with respect to some $q$ which is positive
for all $u, \theta$. Every admissible estimator is a Bayes solution with
respect to some $q$, and the class of Bayes solutions with respect to
weight-functions $q$ is a complete class of estimators.
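The pointwise minimization that yields a Bayes estimator is easy to carry out on a finite sample space. The following sketch is one possible illustration: the binomial family with $n = 4$ and success probabilities .2, .5, .8 (labelled 1, 2, 3), and the weights $q(u,\theta) = 1$ for $u \ne \theta$ (which induce $Q(j,\theta) = |j - \theta|$), are assumptions chosen only for the example.

```python
import math


def bayes_estimator(Q, f, k):
    """For each x, pick a label u in 1..k minimizing sum_theta Q[u][theta] * f[x][theta]."""
    estimate = {}
    for x, densities in f.items():
        def weighted(u):
            return sum(Q[u][theta] * densities[theta] for theta in range(1, k + 1))
        estimate[x] = min(range(1, k + 1), key=weighted)
    return estimate


k = 3
n = 4
p_values = {1: 0.2, 2: 0.5, 3: 0.8}  # hypothetical parameter values by label

# f[x][theta]: binomial probability of x successes under label theta
f = {
    x: {theta: math.comb(n, x) * p ** x * (1.0 - p) ** (n - x)
        for theta, p in p_values.items()}
    for x in range(n + 1)
}

# Q induced by q(u, theta) = 1 for u != theta; row and column 0 are unused
Q = [[float(abs(j - theta)) for theta in range(k + 1)] for j in range(k + 1)]

estimate = bayes_estimator(Q, f, k)
```

In this example the resulting estimator is nondecreasing in $x$; ties, if any occurred, would be broken in favour of the smaller label, which is one arbitrary choice among the Bayes solutions.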

Various specific formulations of the estimation problem can be exhibited as
special cases of the present formulation. For example let $W(j,\theta)$
denote the loss function adopted in any decision-theoretic formulation: the
loss incurred, if $\theta$ is true and it is inferred that $\theta = j$, is
equal to $W(j,\theta)$. Then use of any estimator $\theta^*$ leads, when
$\theta$ is true, to the expected loss

$$
E[W(\theta^*(X),\theta) \mid \theta] = \sum_{j=1}^{k} b(j,\theta,\theta^*)\, W(j,\theta) = r(\theta,\theta^*),
$$

a real-valued risk function (of $\theta$). To illustrate the frequently
adopted specification that losses are proportional to the squared error of
the estimate, we replace the convenient labels $\theta = 1, \ldots, k$ by the
more general parameter values $\theta = \theta_1, \theta_2, \ldots, \theta_k$,
where $\theta_i < \theta_{i+1}$, and write
$W(u,\theta_i) = c(\theta_i)(u - \theta_i)^2$, where $u$ is any value in the
range of $\theta^*$. (The expected mean-squared error can generally be reduced
further by dropping the restriction that the range of $\theta^*$ be the range
of $\theta$; the conflict between these considerations disappears in typical
problems where the range of $\theta$ is an interval.) For any a priori
probabilities $g_i = \mathrm{Prob}(\theta_i)$, $i = 1, \ldots, k$, any
estimator $\theta^*$ gives the Bayes risk

$$
\sum_i g_i\, r(\theta_i,\theta^*) = \sum_i g_i\, c(\theta_i) \sum_u b(u,\theta_i,\theta^*)(u - \theta_i)^2
= R'(Q,\theta^*) = R(q,\theta^*),
$$

where $Q = Q(u,\theta_i) = g_i\, c(\theta_i)(u - \theta_i)^2$; $q(u,\theta_i)$
is determined by $Q$, as above. Numerous examples are treated (without
restrictions on $\Omega$) in the texts and research literature of decision
theory.

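A simple loss function for the estimation problem is one of the form

$$
W(j,\theta) =
\begin{cases}
0, & \text{if } \theta_1(\theta) < j < \theta_2(\theta), \\
c_1(\theta), & \text{if } j \le \theta_1(\theta), \\
c_2(\theta), & \text{if } j \ge \theta_2(\theta),
\end{cases}
$$

where $c_1(\theta) \ge 0$, $c_2(\theta) \ge 0$, and
$\theta_1(\theta) < \theta < \theta_2(\theta)$ for $\theta = 1, 2, \ldots, k$.

This gives the risk function

$$
r(\theta,\theta^*) = c_1(\theta)\, a(\theta_1(\theta),\theta,\theta^*) + c_2(\theta)\, a(\theta_2(\theta),\theta,\theta^*).
$$

If a priori probabilities $g(\theta)$ are adopted, then the Bayes risk, with
the use of $\theta^*$, is

$$
\sum_\theta g(\theta)\,[\,c_1(\theta)\, a(\theta_1(\theta),\theta,\theta^*) + c_2(\theta)\, a(\theta_2(\theta),\theta,\theta^*)\,]
= R'(Q,\theta^*) = R(q,\theta^*),
$$

where

$$
q = q(u,\theta) =
\begin{cases}
g(\theta)\, c_1(\theta), & \text{if } u = \theta_1(\theta), \\
g(\theta)\, c_2(\theta), & \text{if } u = \theta_2(\theta), \\
0, & \text{otherwise,}
\end{cases}
$$

and $Q(j,\theta)$ is determined by $q$ as above.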

The methods of Sections 6-9 above can be characterized in the present terms
as follows: Writing

$$
R(q,\theta^*) = \sum_u \Big[ \sum_{\theta > u} q(u,\theta)\, a(u,\theta,\theta^*) + \sum_{\theta \le u} q(u+1,\theta)\, a(u+1,\theta,\theta^*) \Big],
$$

for each $u$ the summand can be interpreted as a linear combination, with
coefficients $q \ge 0$, of the various probabilities of errors of Types I and
II given by a test of the one-sided hypothesis $H(u)$: $\theta \le u$, against
$H'(u)$: $\theta > u$, where the test has the acceptance region
$A(u) = \{x \mid \theta^*(x) \le u\}$. In other words, each such term (with
index $u$) is the Bayes risk in a certain one-sided testing problem; it is
minimized by a suitable acceptance region $A(u)$ (determined by a technique
equivalent to the Neyman-Pearson lemma); such Bayes acceptance regions are
admissible under mild conditions. If the estimation problem has a suitably
simple structure, and if the weight-function $q$ is a suitable one, then the
acceptance regions $A(u)$ will constitute a nondecreasing sequence in $u$; in
such cases, the Bayes risk in the estimation problem can be minimized by
minimizing simultaneously each of the mentioned terms with respective indices
$u = 1, \ldots, k$. The Bayes estimator obtained in such cases is:

$$
\theta^*(x) =
\begin{cases}
u, & \text{if } x \in A(u) - A(u-1), \text{ for } u = 2, \ldots, k, \\
1, & \text{if } x \in A(1).
\end{cases}
$$

It is problems having this structure which are treated in Sections 6-9 above
(without the restriction that $\Omega$ be finite). The method of Section 8 is
represented by the form assumed by $R(q,\theta^*)$ for the special case of a
simple loss-function, defined as above; in such cases the minimization of a
term of $R$ with index $u$ corresponds to use of the Neyman-Pearson lemma to
determine a best acceptance region $A(u)$ for testing between two simple
hypotheses.
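The passage from a nondecreasing sequence of acceptance regions to an estimator can be written out directly for a finite sample space. In the sketch below the regions are given as Python sets, and the function name and the small example are hypothetical; the estimate at $x$ is the smallest $u$ with $x \in A(u)$, which is the same as $u$ for $x \in A(u) - A(u-1)$.

```python
def estimator_from_regions(regions, sample_space):
    """regions[u - 1] is the acceptance region A(u), u = 1..k, as a set.

    Requires A(1) subset of A(2) subset of ... subset of A(k), with A(k)
    covering the sample space, so that every x receives a label.
    """
    for smaller, larger in zip(regions, regions[1:]):
        if not smaller <= larger:
            raise ValueError("acceptance regions must be nondecreasing in u")
    if set(sample_space) - regions[-1]:
        raise ValueError("A(k) must cover the sample space")
    estimate = {}
    for x in sample_space:
        for u, region in enumerate(regions, start=1):
            if x in region:
                estimate[x] = u  # first (smallest) u whose region contains x
                break
    return estimate


# Hypothetical example with k = 3 on the sample space {0, 1, 2, 3, 4}
regions = [{0, 1}, {0, 1, 2}, {0, 1, 2, 3, 4}]
estimate = estimator_from_regions(regions, range(5))
```

Each region is recovered from the estimator as the set of $x$ with estimate at most $u$, which is the relation $A(u) = \{x \mid \theta^*(x) \le u\}$ used above.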

If $\Omega$ is not finite, after choosing any finite subset
$\Omega' \subset \Omega$ (more or less "representative" of $\Omega$) we can
apply the above simple computational methods to determine Bayes estimators of
$\theta \in \Omega'$. If for any $q$, the Bayes estimator $\theta^*$ of
$\theta \in \Omega'$ is determined essentially uniquely on the sample space
(up to sets having probability 0 for all $\theta \in \Omega$), then $\theta^*$
is an admissible estimator of $\theta \in \Omega$. In this way, elementary
techniques can provide a number of admissible estimators illustrative of the
variety to be found in the full admissible class.

11. An application of the general theory: estimators having prescribed
precision in a specified region; sequential probability ratio estimators.

It is sometimes desired that an estimator have high precision in some
interval in the parameter space, while in the remainder of the parameter
space much lower precision would suffice. In general, efficient achievement
of such a specification requires use of an estimator based on a sequential
sampling rule. One formulation and solution of such a problem is the
following; for illustrative purposes, a concrete example is discussed.

Let $Y_1, Y_2, \ldots$ be independent Bernoulli trials, with
$\mathrm{Prob}(Y_i = 1) = \theta$, $\mathrm{Prob}(Y_i = 0) = 1 - \theta$. An
estimator $\theta^*$ is required which will have high precision for $\theta$
near .5. This requirement may be formulated in part as follows: For
$\theta = .4$ or $.6$, the probability is at least .95 that $\theta^*$ will be
closer to the correct one of these two values; in terms of risk curves of
estimators, we require essentially that $a(.5,.4,\theta^*) \le .05$ and
$a(.5,.6,\theta^*) \le .05$. (Further interpretations of these requirements
in relation to the general notion of precision will appear below.) To meet
these requirements, consider any estimator $\theta^*$, and consider the test
of the one-sided hypothesis $H$: $\theta \le .5$ against $H'$: $\theta > .5$
given by the acceptance region $\{x \mid \theta^*(x) \le .5\}$. (The
description of the sample space on which our estimators are defined remains
to be specified.) The requirements to be met by $\theta^*$ imply that this
test has error-probabilities not exceeding .05 when $\theta = .4$ and
$\theta = .6$. If sequential sampling rules are allowed, it is known that the
last condition is satisfied most efficiently, in terms of expected number of
observations $Y_i$ required under $\theta = .4$ and $\theta = .6$, by Wald's
sequential probability ratio test [28]. (We discuss such tests ignoring
"excess at termination"; in problems of the type being considered, this
entails that some of the following equations represent close approximations;
for certain problems, no such qualification is necessary.) The indicated
sampling rule is: Observe $Y_1, Y_2, \ldots$; compute after each observation
$Y_m$ the sum $d_m = \sum_{i=1}^{m} Y_i$ and $h = h(m,d_m) = 2d_m - m$, and
terminate observation as soon as either $h = k = \log(19)/\log(3/2)$ or
$h = -k$. The resulting sample space is

$$
S = \{x \mid x = (y_1, \ldots, y_n),\ n = 1, 2, \ldots;\ |h(m,d_m)| < k \text{ or } = k \text{ as } m < n \text{ or } m = n\}.
$$

The conditions specified above are met (with minimum expected sample sizes
under all values of $\theta$) by use of this sampling rule and any definition
of $\theta^*(x)$ which satisfies:

$$
\theta^*(x) \le .5 \text{ for } x \text{ such that } h = -k,
$$

$$
\theta^*(x) > .5 \text{ for } x \text{ such that } h = k.
$$
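The sampling rule can be sketched in a few lines. Since $h = 2d_m - m$ moves in unit steps, the code below rounds $k = \log(19)/\log(3/2) \approx 7.26$ up to the integer barrier 8; this rounding is an assumption of the sketch (the text ignores excess at termination), as are the function names. The closed form in `prob_upper` is the standard absorption probability for a simple random walk started midway between two barriers.

```python
import math
import random

K_EXACT = math.log(19) / math.log(1.5)  # about 7.26
K = math.ceil(K_EXACT)                  # integer barrier used in this sketch


def run_sprt(theta, rng, barrier=K):
    """Sample Bernoulli(theta) trials until |h| reaches the barrier.

    Returns (n, h): the number of observations and the terminal value of
    h = 2 * d_n - n.
    """
    d = 0
    m = 0
    while True:
        m += 1
        if rng.random() < theta:
            d += 1
        h = 2 * d - m
        if abs(h) >= barrier:
            return m, h


def prob_upper(theta, barrier=K):
    """Prob(h reaches +barrier before -barrier | theta), for 0 < theta < 1."""
    if theta == 0.5:
        return 0.5
    ratio = (1.0 - theta) / theta
    return 1.0 / (1.0 + ratio ** barrier)
```

With the barrier at 8, `prob_upper(0.4)` and `1 - prob_upper(0.6)` both come out below .05, so the stated error-probability requirements are met with a little to spare.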

The definition of $\theta^*(x)$ can be completed so as to make it admissible
and median-unbiased. (Because $S$ is discrete, use of an auxiliary
randomization variable is necessary to obtain exact median-unbiasedness; we
omit such randomization, obtaining an admissible estimator which is
approximately median-unbiased.)

Every estimator satisfying the preceding inequalities is a Bayes solution for
the above stated problem, given the sample space $S$.

The determination of an admissible estimator among these can be interpreted
as an illustration of the technique of using a sequence of a priori
distributions, and of choosing, among all Bayes solutions for the first such
distribution, one which minimizes the Bayes risk for the second such
distribution.

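We have

$$
S(x,\theta) = \frac{d_n}{\theta} - \frac{n - d_n}{1-\theta}
= \frac{d_n}{\theta(1-\theta)} - \frac{n}{1-\theta}
= \begin{cases}
\dfrac{n(\tfrac12 - \theta)}{\theta(1-\theta)} + \dfrac{k}{2\theta(1-\theta)}, & \text{if } h = k, \\[2ex]
\dfrac{n(\tfrac12 - \theta)}{\theta(1-\theta)} - \dfrac{k}{2\theta(1-\theta)}, & \text{if } h = -k.
\end{cases}
$$

For any fixed $\theta_0 < \tfrac12$, $S(x,\theta_0)$ is an increasing function
of $n$ as $x$ varies subject to $h = -k$; and the set of such points has
probability exceeding $\tfrac12$ when $\theta = \theta_0$. To determine a test
of $H(\theta_0)$: $\theta \le \theta_0$ against $H'(\theta_0)$:
$\theta > \theta_0$, with acceptance region
$\{x \mid \theta^*(x) \le \theta_0\}$, having size $1/2$, and having the
property that it is a locally-best test of this form subject to the
conditions already imposed upon $\theta^*(x)$, it is necessary and sufficient
that $\theta^*(x)$ satisfy the following additional condition: Let
$n(\theta_0)$ be determined by

$$
\mathrm{Prob}(h = -k \text{ and } n \le n(\theta_0) \mid \theta_0) = \tfrac12.
$$

In general, this relationship can be satisfied only approximately, but always
closely except for $\theta_0$ very near 0. Then

$$
\theta^*(x) \le \theta_0 \text{ for } x \text{ such that } h = -k \text{ and } n \le n(\theta_0),
$$

$$
\theta^*(x) > \theta_0 \text{ otherwise.}
$$

As $\theta_0$ increases from 0 to $\tfrac12$, $n(\theta_0)$ takes successively
the values $k, k+1, k+2, \ldots$.

Proceeding similarly for any fixed $\theta_0 > \tfrac12$, we define
$n(\theta_0)$ similarly for such values, and obtain the conditions

$$
\theta^*(x) \le \theta_0 \text{ for } x \text{ such that } h = k \text{ and } n \ge n(\theta_0),
$$

$$
\theta^*(x) > \theta_0 \text{ otherwise (among } x \text{ such that } h = k\text{).}
$$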


It is clear that all of these conditions on $\theta^*(x)$ can be met
simultaneously (allowing the approximations mentioned), and that they provide
a full definition of the estimator. Since this definition depends on $x$ only
through $n = n(x)$ and $h = h(x) = \pm k$, $\theta^*$ depends on $x$ only
through $t = t(x) = h/kn$. The range of $t$ is
$\pm 1, \pm 1/2, \pm 1/3, \ldots$ and $\theta^*$ is an increasing function of
$t$. Let $F(t,\theta) = \mathrm{Prob}[t(X) \le t \mid \theta]$; then the
estimator $\theta^* = \theta^*(x,.5)$ is defined as the root $\theta$ of the
equation

$$
v(x,\theta,.5) \equiv F(t(x),\theta) - .5 = 0.
$$

More generally, for each $\alpha$, $0 < \alpha < 1$, a confidence limit
estimator $\theta^*(x,\alpha)$ is defined as the root $\theta$ of
$v(x,\theta,\alpha) = 0$. (The admissibility of such estimators can be shown
as above.) The family of such estimators constitutes an admissible confidence
curve estimator.

Confidence curve estimates of this kind will be narrow, reflecting high
precision, when $n$ is very large, and will be wide, reflecting low
precision, when $n$ is very small. It follows from the requirements imposed
upon $\theta^*(x,.5)$ above that whenever $\theta^*(x,.5) > .5$, we have
$\theta^*(x,.95) > .4$ (whether $n$ is small or large), and that whenever
$\theta^*(x,.5) < .5$, we have $\theta^*(x,.05) < .6$; hence the 90 percent
confidence interval $J(x) = [\theta^*(x,.95), \theta^*(x,.05)]$ will never
include both the values $\theta = .4$ and $\theta = .6$. (The event
$n(x) = +\infty$, which has probability 0 under each $\theta$, gives
$J(x) = [.4, .6]$ and $\theta^*(x,.5) = .5$.) This constitutes a useful
interpretation of the formulation adopted above of the general requirement of
high precision for $\theta$ near .5.

For practical reasons, it is sometimes necessary to terminate sampling before
this is indicated by the above sampling rule, and the question arises what
inferences can be made validly on the basis of such partial determination of
an observation $x$. Termination after $m$ observations with
$|h(m,d_m)| < k$ is equivalent to observation of the event
$-1/m < t(x) < 1/m$. For each $\alpha$, this implies that the estimate
$\theta^*(x,\alpha)$ (which would have been determined by continuing
sampling) satisfies
$\underline{\theta}^*(x,\alpha) < \theta^*(x,\alpha) < \overline{\theta}^*(x,\alpha)$,
where $\underline{\theta}^*(x,\alpha)$ and $\overline{\theta}^*(x,\alpha)$ are
respectively the roots $\theta$ of $F(-1/m,\theta) = \alpha$ and of
$F(1/m,\theta) = \alpha$. These bounds on an estimate narrow progressively
with increasing $m$. When such bounds on an estimate (or confidence curve)
become sufficiently narrow for the purpose at hand, sampling can be
terminated without affecting the validity of the (approximate) estimates
obtained.

Concerning the computation of values of $F(t,\theta)$ required for use of
such estimators, the function $F(0,\theta)$ of $\theta$ is the operating
characteristic function of a sequential probability ratio test, on which
there is an extensive theoretical and quantitative literature for a wide
range of problems. For each $\theta$, when $F(0,\theta)$ is known, the
determination of $F(t,\theta)$ is reduced to the problem of determining the
conditional cumulative distribution function of $n$ (the number of
observations required for termination of sampling, or the duration of a
random walk with two absorbing barriers) on the condition of termination with
$h = -k$ ("acceptance of $H$: $\theta \le .5$", or absorption at the lower
boundary), and again on the condition of termination with $h = k$ ("rejection
of $H$", or absorption at the upper boundary). (The unconditional
distribution of $n$, together with one of these conditional distributions and
$F(0,\theta)$, determines the other conditional distribution.)
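For the Bernoulli example these distributions can be computed by a direct recursion on the random walk $h$. The sketch below is one way to do so under stated assumptions: the barrier is an integer (for instance 8, the value of $k$ rounded up), sampling is truncated at `n_max` steps, and the function names are illustrative. The sum of the lower-barrier probabilities approximates $F(0,\theta)$, and the conditional distribution functions of $n$ follow by normalizing.

```python
def stopping_time_distribution(theta, barrier, n_max):
    """Return (lower, upper), each a list indexed by n = 0..n_max.

    lower[n] = Prob(sampling stops at step n with h = -barrier | theta),
    upper[n] = Prob(sampling stops at step n with h = +barrier | theta).
    """
    size = 2 * barrier + 1          # positions -barrier..barrier
    mass = [0.0] * size             # mass[i] is the probability of h = i - barrier
    mass[barrier] = 1.0             # the walk starts at h = 0
    lower = [0.0] * (n_max + 1)
    upper = [0.0] * (n_max + 1)
    for n in range(1, n_max + 1):
        new = [0.0] * size
        for idx in range(1, size - 1):  # only interior states keep moving
            p = mass[idx]
            if p == 0.0:
                continue
            new[idx + 1] += p * theta
            new[idx - 1] += p * (1.0 - theta)
        lower[n] = new[0]
        upper[n] = new[size - 1]
        new[0] = 0.0                # absorbed mass leaves the walk
        new[size - 1] = 0.0
        mass = new
    return lower, upper


def conditional_cdf(absorbed):
    """Cumulative distribution of n given absorption at one barrier."""
    total = sum(absorbed)
    running = 0.0
    cdf = []
    for p in absorbed:
        running += p
        cdf.append(running / total)
    return cdf
```

For example, with `lower, upper = stopping_time_distribution(0.4, 8, 2000)`, the value `sum(lower)` approximates $F(0,.4)$, and `conditional_cdf(lower)` gives the conditional distribution function of $n$ on termination at the lower boundary.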


SCHEMATIC ILLUSTRATIONS OF CONFIDENCE CURVE ESTIMATES
OF A BINOMIAL PARAMETER $\theta$ HAVING HIGH PRECISION FOR $\theta$ NEAR 1/2

(A) $n(x)$ very small, $h(x) = -k$

(B) $n(x)$ very large, $h(x) = -k$

(C) Bounds on estimate: sampling curtailed with $m$ very large.

[Figure: each panel plots the confidence curve $c(\theta,x)$ against $\theta$.]